Squid Reverse Proxy and WebDAV caching


Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA

Hello Squid Users,

 

I have configured a squid reverse proxy to access Microsoft SharePoint Online with the aim of caching the document libraries into the squid cache for a branch office.

So far I can see the users' GET requests in the access log, but none of the objects are being stored in the cache.

There are several difficulties in caching the documents:

  1. Microsoft is using SSL (but I have configured SSL bumps)
  2. Files are tagged with the Cache-Control header values no-cache or private
  3. The WebDAV client is the Microsoft Windows 10 client.

Now I would like to know if it's still doable, or if I should just forget this kind of configuration on squid and move on to an alternate caching method (the OneDrive sync client, for example).

 

Thank you.

Regards,

Olivier Marchetta

 


_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 25/08/17 04:16, Olivier MARCHETTA wrote:

> Hello Squid Users,
>
> I have configured a squid reverse proxy to access Microsoft SharePoint
> Online with the aim of caching the document libraries into the squid
> cache for a branch office.
>
> But so far I can see the access log with the GET HTTP requests from the
> users but none will be stored into the cache.
>
> Now there are several difficulties to cache the documents:
>
>  1. Microsoft is using SSL (but I have configured SSL bumps)
>  2. Files are tagged with the cache header no-cache or cache-private

'no-cache' actually means things *are* cacheable. Squid just has to
perform a quick check with the server before using them. Your logs
should contain REFRESH instead of HIT entries for these objects.

The 'private' objects are only usable for one client, so caching is not
useful. Latest Squid can cache them by configuring refresh_pattern
directive ignore-private. Then Squid will do the REFRESH for these as well.

Welcome to HTTP/1.1 where things can be neither HIT nor MISS. The
REFRESH means a server was involved, but the object delivered to the
client may be new or from cache and of vastly different size than the
refresh objects on the server connection.

IMPORTANT: do not configure ignore-private and ignore-must-revalidate
for the same objects. That will corrupt your proxy's responses.
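For illustration only (the pattern and timings here are placeholders, not a recommendation), a refresh_pattern enabling that looks like:

```
# cache 'private' responses, but still revalidate (REFRESH) before each reuse;
# note: no ignore-must-revalidate on the same pattern
refresh_pattern -i \.docx$ 0 20% 4320 ignore-private
```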


>  3. The WebDAV client is the Microsoft Windows 10 client.
>
> Now I would like to know if it’s still doable or if I can just forget
> having this kind of configuration on squid, and move on to an alternate
> caching method (OneDrive sync client for example).
>

If you have a current, up-to-date Squid it is probably caching, but the
absence of the classical "HIT" tag is confusing.

If you are actively seeing MISS in the logs for these objects then we
will need the HTTP transaction headers to see what is going on. That can
be retrieved with a debug_options 11,2 trace.
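As a sketch of that in squid.conf (section 11 covers HTTP transaction traffic; the ALL,1 part keeps normal logging):

```
# log the HTTP transaction headers to cache.log
debug_options ALL,1 11,2
```

Then reload with `squid -k reconfigure` and reproduce one of the MISS requests.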


Amos

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello Amos,

Thank you for your help.
I have probably misconfigured the refresh_pattern in my config file.
More information below.
My squid conf file:

---------------------------------------------------------------------
http_port 10.10.10.10:3128
icp_port 0
digest_generation off
dns_v4_first on
pid_filename /var/run/squid/squid.pid
cache_effective_user squid
cache_effective_group proxy
error_default_language en
icon_directory /usr/local/etc/squid/icons
visible_hostname pfSense Firewall
cache_mgr [hidden email]
access_log /var/squid/logs/access.log
cache_log /var/squid/logs/cache.log
cache_store_log none
netdb_filename /var/squid/logs/netdb.state
pinger_enable on
pinger_program /usr/local/libexec/squid/pinger

logfile_rotate 7
debug_options rotate=7
shutdown_lifetime 3 seconds
# Allow local network(s) on interface(s)
acl localnet src  10.10.10.0/24
forwarded_for on
uri_whitespace strip

cache_mem 128 MB
maximum_object_size_in_memory 20 MB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
minimum_object_size 0 KB
maximum_object_size 20 MB
cache_dir ufs /var/squid/cache 300 16 256
offline_mode on
cache_swap_low 90
cache_swap_high 95
cache allow all
# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:    1440  20%  10080
refresh_pattern ^gopher:  1440  0%  1440
refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
refresh_pattern .    0  20%  4320
refresh_pattern -i \.jpg$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
refresh_pattern -i \.pdf$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
refresh_pattern -i \.docx$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private

#Remote proxies

# Setup some default acls
# ACLs all, manager, localhost, and to_localhost are predefined.
acl allsrc src all
acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901 4443 3128 3129 1025-65535
acl sslports port 443 563 4443
---------------------------------------------------------------------


The Squid access log:
---------------------------------------------------------------------
Date   IP   Status   URL
24.08.2017 12:42:18   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg
24.08.2017 12:42:17   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf
24.08.2017 12:42:16   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx
---------------------------------------------------------------------


The cache manager info:
---------------------------------------------------------------------
Cache information for squid:
   Hits as % of all requests:   5min: 0.0%, 60min: 0.0%
   Hits as % of bytes sent:   5min: 0.0%, 60min: 0.0%
   Memory hits as % of hit requests:   5min: 0.0%, 60min: 0.0%
   Disk hits as % of hit requests:   5min: 0.0%, 60min: 0.0%
   Storage Swap size:   0 KB
   Storage Swap capacity:    0.0% used, 100.0% free
   Storage Mem size:   216 KB
   Storage Mem capacity:    0.2% used, 99.8% free
   Mean Object Size:   0.00 KB
---------------------------------------------------------------------


Regards,
Olivier MARCHETTA


Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 25/08/17 20:18, Olivier MARCHETTA wrote:
> Hello Amos,
>
> Thank you for your help.
> I have probably misconfigured the refresh_pattern in my config file.
> Below more information.
> My squid conf file:
>
> ---------------------------------------------------------------------
> http_port 10.10.10.10:3128

You said this was a reverse-proxy. This config file is for a
forward/explicit proxy.

A reverse-proxy with the role you stated earlier would be configured with:

   http_port 3128
   http_port 80 accel
   https_port 443 accel cert=.. key=...
   cache_peer tenant.sharepoint.com parent 80 0 originserver
   acl SP dstdomain tenant.sharepoint.com
   cache_peer_access tenant.sharepoint.com allow SP
   http_access allow SP


> icp_port 0
> digest_generation off
> dns_v4_first on
> pid_filename /var/run/squid/squid.pid
> cache_effective_user squid
> cache_effective_group proxy
> error_default_language en
> icon_directory /usr/local/etc/squid/icons
> visible_hostname pfSense Firewall

As the name of the directive above indicates, it is supposed to be a
*hostname*. More specifically it is the publicly visible FQDN of the
Squid server. It will be used in error page URLs, for fetching the icons
etc.

"http://pfsense Firewall/" is a pretty funny URL for Squid.
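A sketch of the intended form, with a placeholder FQDN:

```
visible_hostname squid.example.net
```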



> cache_mgr [hidden email]
> access_log /var/squid/logs/access.log
> cache_log /var/squid/logs/cache.log
> cache_store_log none
> netdb_filename /var/squid/logs/netdb.state
> pinger_enable on
> pinger_program /usr/local/libexec/squid/pinger
>
> logfile_rotate 7
> debug_options rotate=7
> shutdown_lifetime 3 seconds
> # Allow local network(s) on interface(s)
> acl localnet src  10.10.10.0/24
> forwarded_for on
> uri_whitespace strip
>
> cache_mem 128 MB
> maximum_object_size_in_memory 20 MB
> memory_replacement_policy heap GDSF
> cache_replacement_policy heap LFUDA
> minimum_object_size 0 KB
> maximum_object_size 20 MB
> cache_dir ufs /var/squid/cache 300 16 256
> offline_mode on
> cache_swap_low 90
> cache_swap_high 95
> cache allow all

NP: it's pretty pointless to configure things to their default values.
You can simplify your config quite a lot by removing many of the above
lines.

> # Add any of your own refresh_pattern entries above these.

Please re-read the above sentence from your squid.conf.

Order is important. <https://wiki.squid-cache.org/SquidFaq/OrderIsImportant>

> refresh_pattern ^ftp:    1440  20%  10080
> refresh_pattern ^gopher:  1440  0%  1440
> refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
> refresh_pattern .    0  20%  4320
> refresh_pattern -i \.jpg$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
> refresh_pattern -i \.pdf$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
> refresh_pattern -i \.docx$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
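To make the ordering concrete: the custom lines need to come before the catch-all '.' rule, otherwise '.' matches every URL first and the custom lines are never consulted. The same lines as above, reordered:

```
# site-specific rules first
refresh_pattern -i \.jpg$  30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
refresh_pattern -i \.pdf$  30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
refresh_pattern -i \.docx$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
# defaults last
refresh_pattern ^ftp:     1440 20% 10080
refresh_pattern ^gopher:  1440  0%  1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern .            0 20%  4320
```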


Also,

>
> #Remote proxies
>
> # Setup some default acls
> # ACLs all, manager, localhost, and to_localhost are predefined.
> acl allsrc src all

I suggest you double-check anywhere you are using the "allsrc" ACL. If
it is not explicitly being used as a name to attach a deny_info to then
it is a pointless waste of memory to redefine like this - just use the
built-in 'all' ACL name.


> acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901 4443 3128 3129 1025-65535

NP: with the 1025-65535 set of ports listed you don't need to have
explicit entries for those ports higher than 1025.

Also, since this was apparently a reverse-proxy for HTTP (and the log
seems to show HTTPS as well) it will not be receiving URLs on any ports
other than 80 and 443.


> acl sslports port 443 563 4443
> ---------------------------------------------------------------------
>
>
> The Squid access log:
> ---------------------------------------------------------------------
> Date   IP   Status   URL
> 24.08.2017 12:42:18   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg
> 24.08.2017 12:42:17   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf
> 24.08.2017 12:42:16   10.10.10.100   TCP_MISS/200   https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx
> ---------------------------------------------------------------------
>
>
> The cache manager info:
> ---------------------------------------------------------------------
> Cache information for squid:
>     Hits as % of all requests:   5min: 0.0%, 60min: 0.0%
>     Hits as % of bytes sent:   5min: 0.0%, 60min: 0.0%
>     Memory hits as % of hit requests:   5min: 0.0%, 60min: 0.0%
>     Disk hits as % of hit requests:   5min: 0.0%, 60min: 0.0%
>     Storage Swap size:   0 KB
>     Storage Swap capacity:    0.0% used, 100.0% free
>     Storage Mem size:   216 KB
>     Storage Mem capacity:    0.2% used, 99.8% free
>     Mean Object Size:   0.00 KB
> ---------------------------------------------------------------------
>

Okay, not much caching. You got that debug trace?

Amos

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello,

Finally Squid is caching my SharePoint online documents.
But it doesn't work yet.
If I enable offline mode, the WebDAV client will not be able to download documents from the cache.
And I will see the following errors in the log:

---------------------------------------------------------------------------------
TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
---------------------------------------------------------------------------------

If I disable offline mode, then nothing gets downloaded from the cache.

I have removed all ACL control from the squid conf (to make it easier for now).
I have replaced all the refresh patterns with custom ones (found on the Internet, from another SharePoint caching project).

Sorry for the long file below, but I am posting my conf file again.
I don't know why the Squid cache is aborting the cache HIT.
If you have any clue, it would be very welcome.


---------------------------------------------------------------------------------
http_port 92.222.209.108:3128
icp_port 0
digest_generation off
dns_v4_first on
pid_filename /var/run/squid/squid.pid
cache_effective_user squid
cache_effective_group proxy
error_default_language en
icon_directory /usr/local/etc/squid/icons
visible_hostname sv-1101-wvp01.virtualdesk.cloud
cache_mgr [hidden email]
access_log /var/squid/logs/access.log
cache_log /var/squid/logs/cache.log
cache_store_log none
netdb_filename /var/squid/logs/netdb.state
pinger_enable on
pinger_program /usr/local/libexec/squid/pinger

logfile_rotate 7
debug_options rotate=7
shutdown_lifetime 3 seconds
# Allow local network(s) on interface(s)
acl localnet src  92.222.209.0/24
forwarded_for on
uri_whitespace strip


cache_mem 128 MB
maximum_object_size_in_memory 512 KB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
minimum_object_size 0 KB
maximum_object_size 20 MB
cache_dir ufs /var/squid/cache 100 16 256
offline_mode off
cache_swap_low 90
cache_swap_high 95
cache allow all

# Cache documents regardless what the server says
refresh_pattern .jpg 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .gif 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .png 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .txt 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .doc 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .docx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .xls 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .xlsx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
refresh_pattern .pdf 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth

# Setup acls
acl allsrc src all
http_access allow all

request_body_max_size 0 KB
delay_pools 1
delay_class 1 2
delay_parameters 1 -1/-1 -1/-1
delay_initial_bucket_level 100
delay_access 1 allow allsrc

# Reverse Proxy settings
https_port 92.222.209.108:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
cache_peer olicomp.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint
deny_info TCP_RESET allsrc
---------------------------------------------------------------------------------

Regards,
Olivier MARCHETTA

Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 26/08/17 00:49, Olivier MARCHETTA wrote:
> Hello,
>
> Finally Squid is caching my SharePoint online documents.
> But it doesn't work yet.
> If I enable offline mode, the WebDAV client will not be able to download documents from the cache.

That directive was designed for HTTP/1.0 behaviours and only works for
objects with optional revalidation, i.e. when the server delegates the
caching freshness decision to the proxy.

When it is applied to content with mandatory revalidation (anything with
no-cache, private, no-store or must-revalidate directives in HTTP/1.1
traffic), the result is that things are prohibited from being delivered
AND prohibited from being updated.

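A toy classifier for the distinction above: which Cache-Control directives leave the freshness decision with the cache (offline_mode's territory), which demand revalidation, and which forbid storage outright. Illustrative only; real caches consider many more inputs (Expires, Vary, request directives, etc.):

```python
def cache_policy(cache_control: str) -> str:
    """Classify a Cache-Control response header for a shared cache (sketch)."""
    # Split "no-cache, max-age=0" into a set of directive names.
    directives = {d.strip().split("=")[0].lower()
                  for d in cache_control.split(",") if d.strip()}
    if "no-store" in directives:
        return "do not store"
    if directives & {"no-cache", "must-revalidate", "proxy-revalidate"}:
        return "store, but revalidate before every reuse"
    if "private" in directives:
        return "shared caches: do not reuse (unless ignore-private)"
    return "store; freshness decision delegated to the cache"

assert cache_policy("no-cache") == "store, but revalidate before every reuse"
assert cache_policy("no-store") == "do not store"
```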

> And I will see the following errors in the log:
>
> ---------------------------------------------------------------------------------
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> ---------------------------------------------------------------------------------
>

Squid was simply not able to deliver anything to this client, not even
an error message for some reason.

It might be bugs in Squid preventing it generating an error page
(ABORTED with 5xx status). But usually ABORTED/000 means the client was
the one aborting / disconnecting before any HTTP response at all could
be delivered.


> If I disable offline mode, then nothing gets downloaded from the cache.

How are you determining that?

What I can see in the info so far provided is that Squid *is* finding
cached content to work with.


>
> I have removed all ACL control from the squid conf (to make it easier for now).
> I have replaced all refresh patterns by customs one (that I've found on Internet from another SharePoint caching project).
>
> Sorry for the long file below, but I am posting my conf file again.
> I don't know why the Squid cache is aborting the cache HIT.

You are forcing Squid to cache things that are marked as non-cacheable
because they contain client-specific security or privacy details. Since
the proxy is unable to determine for itself (on these objects) which
details go to which client, caching these things can only be done with
revalidation before HIT delivery.

Then you are also configuring Squid to be forbidden to revalidate
anything at all.


I suspect we have a bug somewhere in Squid that makes it do the
ABORTED/000; it should be doing a forced-MISS or a 5xx error with your
config. But that is not what you need to happen anyhow, so fixing
that particular bug won't help you.


> If you have any clue, it would be very welcome.
>
>  
> ---------------------------------------------------------------------------------
> http_port 92.222.209.108:3128
> icp_port 0
> digest_generation off
> dns_v4_first on
> pid_filename /var/run/squid/squid.pid
> cache_effective_user squid
> cache_effective_group proxy
> error_default_language en
> icon_directory /usr/local/etc/squid/icons
> visible_hostname sv-1101-wvp01.virtualdesk.cloud
> cache_mgr [hidden email]
> access_log /var/squid/logs/access.log
> cache_log /var/squid/logs/cache.log
> cache_store_log none
> netdb_filename /var/squid/logs/netdb.state
> pinger_enable on
> pinger_program /usr/local/libexec/squid/pinger
>
> logfile_rotate 7
> debug_options rotate=7
> shutdown_lifetime 3 seconds
> # Allow local network(s) on interface(s)
> acl localnet src  92.222.209.0/24
> forwarded_for on
> uri_whitespace strip
>
>
> cache_mem 128 MB
> maximum_object_size_in_memory 512 KB
> memory_replacement_policy heap GDSF
> cache_replacement_policy heap LFUDA
> minimum_object_size 0 KB
> maximum_object_size 20 MB
> cache_dir ufs /var/squid/cache 100 16 256
> offline_mode off
> cache_swap_low 90
> cache_swap_high 95
> cache allow all
>
> # Cache documents regardless what the server says
> refresh_pattern .jpg 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .gif 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .png 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .txt 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .doc 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .docx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xls 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xlsx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .pdf 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
>


The normal refresh_pattern lines should stay; they just need to be down
here, following your custom ones. At minimum the cgi-bin and '.' patterns
are necessary for correct handling of dynamic content in the cache.

[ Sorry I pressed send by accident earlier before completing that
"Also," statement which was intended to say that. ]


* The ignore-no-cache option was removed from Squid some versions ago.
As I mentioned earlier, CC:no-cache actually means things *are* cacheable
in HTTP/1.1, so the directive's intended effect is met by current Squid's
default behaviour.


* The 50% only means +50% of the object's current age. Which can be very
short for frequently or recently updated objects. Percentages over 100%
are possible here, and usually necessary for good caching times.
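A rough Python sketch of that percentage rule (simplified; Squid's real refresh algorithm handles more cases, and all times here are in minutes). MIN, PERCENT and MAX correspond to the three numbers in "refresh_pattern <regex> MIN PERCENT% MAX":

```python
def is_fresh(age, last_modified_age, min_age, percent, max_age):
    """Return True if a cached object may be served without revalidation.

    age: how long the object has been in the cache
    last_modified_age: Date minus Last-Modified when the object was fetched
    """
    if age > max_age:
        return False         # older than MAX: always stale
    if age <= min_age:
        return True          # younger than MIN: always fresh
    # Between MIN and MAX: fresh while age < PERCENT of the object's
    # age at the server (the "last-modified factor").
    return age < (percent / 100.0) * last_modified_age

# An object last modified 10 minutes before it was fetched, with
# "refresh_pattern ... 0 50% 4320": stale after only 5 minutes in cache.
assert is_fresh(4, 10, 0, 50, 4320) is True
assert is_fresh(6, 10, 0, 50, 4320) is False
# With 200%, the same object stays fresh for 20 minutes.
assert is_fresh(15, 10, 0, 200, 4320) is True
```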

* override-lastmod was useful once to avoid bugs (and side-effects from
misconfigured percentages mentioned above). But current Squid can figure
out Last-Modified values from Dates and timestamps as needed. So the
option is rarely necessary, and more often than not actually causes worse
caching by prohibiting Squid from doing heuristic freshness calculations.
  YMMV, so testing for your specific traffic is needed before using this
option in current Squid.
  --> and remember how I mentioned offline_mode only works when the
proxy is delegated the freshness calculations? This option prohibits
Squid from doing that calculation and uses the admin's 14400 minute
value instead.


* "reload-into-ims ignore-reload" these two options are mutually
exclusive. Changing a reload header value and ignoring it cannot be done
simultaneously. Pick one:

  ignore-reload - completely ignore the client indication that it needs
the latest data. Note that this is redundant with what offline_mode
does, but far more selective about what URLs it happens for.

  reload-into-ims - ask the server if any changes have happened, so the
cached content can be delivered if none instead of a full re-fetch.


* Since all of these lines are identical except for the regex pattern
they apply to, you would save a lot more CPU cycles by combining the
regex into one pattern and only having one config line for the lot:

  refresh_pattern \.(jpg|gif|png|txt|docx?|xlsx?|pdf) 14400 50% 18000 \
    override-expire reload-into-ims ignore-private ignore-auth



* ignore-auth - I would also check the actual response headers from the
server before using this option. While authentication credentials
normally mean non-cacheable in HTTP/1.0 traffic, in HTTP/1.1 they mean
mandatory revalidation in most cases, and sometimes are irrelevant.
  What this option actually does is exclude the special handling when
auth headers are present - it actively *prevents* some HTTP/1.1 traffic
being HIT on, when the special conditions said auth was cacheable or
irrelevant.


> # Setup acls
> acl allsrc src all
> http_access allow all
>
> request_body_max_size 0 KB
> delay_pools 1
> delay_class 1 2
> delay_parameters 1 -1/-1 -1/-1
> delay_initial_bucket_level 100
> delay_access 1 allow allsrc

These delay_parameters are doing nothing but wasting a surprisingly
large amount of CPU time and memory for calculating traffic numbers and
repeatedly pausing transactions for 0 milliseconds.


>
> # Reverse Proxy settings
> https_port 92.222.209.108:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
> cache_peer olicomp.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint

Avoid DONT_VERIFY_PEER like a plague. Find out the CA(s) which sign the
peer's certs and configure Squid to trust only the right CA for these
peer links, then add the NO_DEFAULT_CA flag. Even if it is one of the
normal global CA.

That will prevent unapproved MITM on your upstream traffic and help
detect traffic loops if the DNS+Squid config gets wonky.
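A hedged sketch of that peer line (the CA bundle path is a placeholder; verify the exact option names against your Squid version):

```
cache_peer olicomp.sharepoint.com parent 443 0 no-query no-digest originserver \
    login=PASSTHRU connection-auth=on name=rvp_sharepoint \
    ssl sslcafile=/usr/local/etc/squid/sharepoint-ca.pem sslflags=NO_DEFAULT_CA
```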


> deny_info TCP_RESET allsrc

This deny_info is explicitly configuring Squid to send a TCP_RESET (aka
ABORTED/000) when ACL "allsrc" is the reason for transaction denial.

With your access control rules removed it should not be having an
effect, but beware of the above when you reinstate those rules.

Amos

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello Amos,

Thank you for your answer.
I have applied the configuration updates you recommended.
My squid config file is simpler now.
But unfortunately, I can see the cache filling itself, but not being hit.

Here's the internal manager info log:
------------------------------------------------------------------------------
Cache information for squid:
        Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
        Hits as % of bytes sent: 5min: 0.4%, 60min: 0.4%
        Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
        Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
        Storage Swap size: 70752 KB
        Storage Swap capacity: 69.1% used, 30.9% free
        Storage Mem size: 216 KB
        Storage Mem capacity: 0.2% used, 99.8% free
        Mean Object Size: 1768.80 KB
        Requests given to unlinkd: 56
------------------------------------------------------------------------------

As you can see, the cache swap capacity has been filled to about 70%.
But when I try to get the same content from another computer, the transfer is slow and the HIT percentage stays at 0.

I have noticed the following errors in the Cache.log:
Could not parse headers from on disk object

I don't really know what to do next. I do not understand what this error means.

My squid.conf file below:

------------------------------------------------------------------------------
# This file is automatically generated by pfSense
# Do not edit manually !

http_port 10.10.10.10:3128
icp_port 0
digest_generation off
dns_v4_first on
pid_filename /var/run/squid/squid.pid
cache_effective_user squid
cache_effective_group proxy
error_default_language en
icon_directory /usr/local/etc/squid/icons
visible_hostname pfSense Firewall
cache_mgr [hidden email]
access_log /var/squid/logs/access.log
cache_log /var/squid/logs/cache.log
cache_store_log none
netdb_filename /var/squid/logs/netdb.state
pinger_enable on
pinger_program /usr/local/libexec/squid/pinger

logfile_rotate 7
debug_options rotate=7
shutdown_lifetime 3 seconds
forwarded_for on
uri_whitespace strip

cache_mem 128 MB
maximum_object_size_in_memory 256 KB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
minimum_object_size 0 KB
maximum_object_size 20 MB
cache_dir ufs /var/squid/cache 100 16 256
offline_mode off
cache_swap_low 90
cache_swap_high 95
cache allow all

refresh_pattern \.(jpg|gif|png|txt|docx|xlsx|pdf) 60 90% 600 override-expire reload-into-ims ignore-private

# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:    1440  20%  10080
refresh_pattern ^gopher:  1440  0%  1440
refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
refresh_pattern .    0  20%  4320

# Reverse Proxy settings
https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
#
cache_peer tenant.sharepoint.com parent 443 0 originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER

acl allsrc src all
http_access allow all
------------------------------------------------------------------------------

PS: I will keep DONT_VERIFY_PEER for testing, but I understand this is a security risk.
Thanks.

Regards,
Olivier MARCHETTA


-----Original Message-----
From: squid-users [mailto:[hidden email]] On Behalf Of Amos Jeffries
Sent: Saturday, August 26, 2017 5:21 AM
To: [hidden email]
Subject: Re: [squid-users] Squid Reverse Proxy and WebDAV caching

On 26/08/17 00:49, Olivier MARCHETTA wrote:
> Hello,
>
> Finally Squid is caching my SharePoint online documents.
> But it doesn't work yet.
> If I enable offline mode, the WebDAV client will not be able to download documents from the cache.

That directive was designed for HTTP/1.0 behaviours and only works for objects with optional revalidation, i.e. when the server delegates the caching freshness decision to the proxy.

When it is applied to content with mandatory revalidation - anything carrying no-cache, private, no-store, or must-revalidate directives in HTTP/1.1 traffic - the result is that things are prohibited from being delivered AND prohibited from being updated.
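To make that distinction concrete, here is a rough sketch in Python. The names are mine, not Squid internals: offline_mode can only serve a stored object when no mandatory-revalidation directive is present.

```python
# Illustrative sketch only - these names are not Squid internals.
# offline_mode can serve a stored object only when the server has
# delegated the freshness decision to the proxy, i.e. when none of
# the mandatory-revalidation directives below are present.

MANDATORY_REVALIDATION = {"no-cache", "private", "no-store", "must-revalidate"}

def offline_hit_allowed(cache_control: str) -> bool:
    directives = {part.strip().split("=")[0].lower()
                  for part in cache_control.split(",") if part.strip()}
    return not (directives & MANDATORY_REVALIDATION)

# SharePoint-style responses carry "private" or "no-cache", so under
# offline_mode they can neither be delivered nor updated:
assert not offline_hit_allowed("private, max-age=0")
assert offline_hit_allowed("public, max-age=3600")
```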


> And I will see the following errors in the log:
>
> ---------------------------------------------------------------------------------
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> ---------------------------------------------------------------------------------
>

Squid was simply not able to deliver anything to this client, not even
an error message for some reason.

It might be bugs in Squid preventing it from generating an error page
(ABORTED with 5xx status). But usually ABORTED/000 means the client was
the one aborting / disconnecting before any HTTP response at all could
be delivered.


> If I disable offline mode, then nothing gets downloaded from the cache.

How are you determining that?

What I can see in the info so far provided is that Squid *is* finding
cached content to work with.


>
> I have removed all ACL control from the squid conf (to make it easier for now).
> I have replaced all refresh patterns with custom ones (that I found on the Internet, from another SharePoint caching project).
>
> Sorry for the long file below, but I am posting my conf file again.
> I don't know why the Squid cache is aborting the cache HIT.

You are forcing Squid to cache things that are marked as non-cacheable
because they contain client-specific security or privacy details. Since
the proxy is unable to determine for itself (on these objects) which
details go to which client, caching these things can only be done with
revalidation before HIT delivery.

Then you are also configuring Squid to be forbidden to revalidate
anything at all.
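That revalidation can be pictured as a conditional request: the proxy asks the origin whether the stored copy changed, and only serves the cached body on a 304. A hedged sketch (the function name and return values are illustrative, not Squid code):

```python
from email.utils import parsedate_to_datetime  # stdlib HTTP-date parser

# Illustrative model of "revalidation before HIT delivery" - not Squid code.
def revalidate(cached_last_modified: str, origin_last_modified: str) -> str:
    """Model an If-Modified-Since exchange between proxy and origin."""
    cached = parsedate_to_datetime(cached_last_modified)
    origin = parsedate_to_datetime(origin_last_modified)
    if origin <= cached:
        # Origin answers 304 Not Modified: the cached body is served (a HIT).
        return "304"
    # Origin answers 200 with a new body: full re-fetch (a MISS).
    return "200"

assert revalidate("Tue, 29 Aug 2017 12:00:00 GMT",
                  "Tue, 29 Aug 2017 12:00:00 GMT") == "304"
```

Forbidding revalidation removes the only path by which such objects could ever be served from cache.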


I suspect we have a bug somewhere in Squid that makes it do the
ABORT/000; it should be doing a forced-MISS or a 5xx error with your
config. But that is not what you need to happen anyhow, so fixing
that particular bug won't help you.


> If you have any clue, it would be very welcome.
>
>  
> ---------------------------------------------------------------------------------
> http_port 92.222.209.108:3128
> icp_port 0
> digest_generation off
> dns_v4_first on
> pid_filename /var/run/squid/squid.pid
> cache_effective_user squid
> cache_effective_group proxy
> error_default_language en
> icon_directory /usr/local/etc/squid/icons
> visible_hostname sv-1101-wvp01.virtualdesk.cloud
> cache_mgr [hidden email]
> access_log /var/squid/logs/access.log
> cache_log /var/squid/logs/cache.log
> cache_store_log none
> netdb_filename /var/squid/logs/netdb.state
> pinger_enable on
> pinger_program /usr/local/libexec/squid/pinger
>
> logfile_rotate 7
> debug_options rotate=7
> shutdown_lifetime 3 seconds
> # Allow local network(s) on interface(s)
> acl localnet src  92.222.209.0/24
> forwarded_for on
> uri_whitespace strip
>
>
> cache_mem 128 MB
> maximum_object_size_in_memory 512 KB
> memory_replacement_policy heap GDSF
> cache_replacement_policy heap LFUDA
> minimum_object_size 0 KB
> maximum_object_size 20 MB
> cache_dir ufs /var/squid/cache 100 16 256
> offline_mode off
> cache_swap_low 90
> cache_swap_high 95
> cache allow all
>
> # Cache documents regardless what the server says
> refresh_pattern .jpg 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .gif 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .png 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .txt 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .doc 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .docx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xls 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xlsx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .pdf 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
>


The normal refresh_pattern lines should stay, just down here following
your custom ones. At minimum the cgi-bin and '.' patterns are
necessary for correct handling of dynamic content in the cache.

[ Sorry I pressed send by accident earlier before completing that
"Also," statement which was intended to say that. ]


* The ignore-no-cache option was removed from Squid some versions ago.
As I mentioned earlier, CC:no-cache actually means things *are* cacheable
in HTTP/1.1, so the directive's intended effect is met by current Squid's
default behaviour.


* The 50% only means +50% of the object's current age, which can be very
short for frequently or recently updated objects. Percentages over 100%
are possible here, and usually necessary for good caching times.
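The percentage arithmetic can be sketched as follows (minutes throughout; a simplified model of refresh_pattern's "min pct% max" fields, not Squid's exact algorithm):

```python
# Simplified model of refresh_pattern "min pct% max" - not Squid's exact
# algorithm. age_at_fetch_min is (Date - Last-Modified) in minutes.
def freshness_lifetime_minutes(age_at_fetch_min, pct, min_min, max_min):
    heuristic = age_at_fetch_min * (pct / 100.0)
    return max(min_min, min(heuristic, max_min))

# A document modified 10 minutes before it was fetched, under a
# "14400 50% 18000" rule: 50% of 10 min is 5, so the 14400 floor wins.
assert freshness_lifetime_minutes(10, 50, 14400, 18000) == 14400
# Under the default "0 20% 4320" rule the same object stays fresh for
# only 2 minutes - which is why percentages over 100% are often needed.
assert freshness_lifetime_minutes(10, 20, 0, 4320) == 2
```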

* override-lastmod was useful once to avoid bugs (and side-effects from
the misconfigured percentages mentioned above). But current Squid can
figure out Last-Modified values from Dates and timestamps as needed, so
the option is rarely necessary and more often than not actually causes
worse caching by prohibiting Squid from doing heuristic freshness
calculations.
  YMMV, so testing against your specific traffic is needed before using
this option in current Squid.
  --> and remember how I mentioned offline_mode only works when the
proxy is delegated the freshness calculations? This option prohibits
Squid from doing that calculation and uses the admin's 14400-minute
value instead.


* "reload-into-ims ignore-reload" - these two options are mutually
exclusive; changing a reload header value and ignoring it cannot be done
simultaneously. Pick one:

  ignore-reload - completely ignore the client indication that it needs
the latest data. Note that this is redundant with what offline_mode
does, but far more selective about what URLs it happens for.

  reload-into-ims - ask the server if any changes have happened, so the
cached content can be delivered if none have, instead of a full re-fetch.


* Since all of these lines are identical except for the regex pattern
for URLs they apply to, you would save a lot of CPU cycles by combining
the regex into one pattern and only having one config line for the lot:

  refresh_pattern \.(jpg|gif|png|txt|docx?|xlsx?|pdf) 14400 50% 18000 \
    override-expire reload-into-ims ignore-private ignore-auth
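A quick way to sanity-check a consolidated pattern like this is with an ordinary regex engine. Note that the "|" separator before pdf is easy to drop; written as "xlsx?pdf", the pattern would only match the literal strings "xlspdf"/"xlsxpdf" and .xls, .xlsx and .pdf would silently stop matching.

```python
import re

# The consolidated extension pattern; anchored with $ here purely for
# the demo (Squid's refresh_pattern regex is unanchored).
pattern = re.compile(r"\.(jpg|gif|png|txt|docx?|xlsx?|pdf)$", re.IGNORECASE)

assert all(pattern.search(ext)
           for ext in (".jpg", ".doc", ".docx", ".xls", ".xlsx", ".pdf"))
assert not any(pattern.search(ext) for ext in (".exe", ".html"))
```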



* ignore-auth - I would also check the actual response headers from the
server before using this option. While authentication credentials
normally mean non-cacheable in HTTP/1.0 traffic, in HTTP/1.1 they mean
mandatory revalidation in most cases and are sometimes irrelevant.
  What this option actually does is exclude the special handling when
auth headers are present - it actively *prevents* some HTTP/1.1 traffic
from being HIT on, when the special conditions were saying auth was
cacheable or irrelevant.


> # Setup acls
> acl allsrc src all
> http_access allow all
>
> request_body_max_size 0 KB
> delay_pools 1
> delay_class 1 2
> delay_parameters 1 -1/-1 -1/-1
> delay_initial_bucket_level 100
> delay_access 1 allow allsrc

These delay_parameters are doing nothing but wasting a surprisingly
large amount of CPU time and memory for calculating traffic numbers and
repeatedly pausing transactions for 0 milliseconds.


>
> # Reverse Proxy settings
> https_port 92.222.209.108:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
> cache_peer olicomp.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint

Avoid DONT_VERIFY_PEER like the plague. Find out the CA(s) which sign the
peer's certs and configure Squid to trust only the right CA for these
peer links, then add the NO_DEFAULT_CA flag. Do this even if it is one of
the normal global CAs.

That will prevent unapproved MITM on your upstream traffic and help
detect traffic loops if the DNS+Squid config gets wonky.
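For example, a minimal sketch of a verified peer line - the CA bundle path and filename here are assumptions; export the actual chain that signs the tenant's certificate and point sslcafile at it:

```
# Hypothetical example - trust only the issuing CA of the SharePoint peer.
# /usr/local/etc/squid/sharepoint-ca.pem is a placeholder path.
cache_peer tenant.sharepoint.com parent 443 0 originserver \
    login=PASSTHRU connection-auth=on ssl \
    sslcafile=/usr/local/etc/squid/sharepoint-ca.pem \
    sslflags=NO_DEFAULT_CA
```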


> deny_info TCP_RESET allsrc

This deny_info is explicitly configuring Squid to send a TCP_RESET (aka
ABORTED/000) when ACL "allsrc" is the reason for transaction denial.

With your access control rules removed it should not be having an
effect, but beware of the above when you reinstate those rules.

Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello again,

I have quickly set up Squid version 3.5.26 on Windows with a minimalist config file:

-------------------------------------------------------------------
acl allsrc src all
http_access allow allsrc
http_port 3128
cache_dir ufs /cygdrive/c/squidcache 100 16 256
coredump_dir /var/cache/squid
refresh_pattern \.(jpg|gif|png|txt|docx|xlsx|pdf) 60 90% 600 override-expire reload-into-ims ignore-private
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
# Reverse Proxy settings
https_port 10.10.10.10:443 accel defaultsite=olicomp.sharepoint.com cert=/cygdrive/c/squidssl/sharepoint.com.crt key=/cygdrive/c/squidssl/sharepoint.com.key
cache_peer 123.123.123.123 parent 443 0 originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER
-------------------------------------------------------------------

Same result. I can see the cache disk folder being filled up,
but the HIT will never happen.
Latency stays high and speed stays low on document access / file transfer.
Can Microsoft SharePoint servers in Office 365 block any caching attempt?
Also, I am using a WebDAV client. Is WebDAV supported?

-------------------------------------------------------------------
Cache information for squid:
Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
Hits as % of bytes sent: 5min: 0.0%, 60min: 0.0%
Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
-------------------------------------------------------------------

Regards,
Olivier MARCHETTA



Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello,

Sorry for posting fast.
I have done another test, using Internet Explorer to download the files instead of WebDAV.
And now I can see the cache hits rising up to 100% in memory.

-------------------------------------------------------------------
Cache information for squid:
Hits as % of all requests: 5min: 17.7%, 60min: 6.2%
Hits as % of bytes sent: 5min: 4.5%, 60min: 0.2%
Memory hits as % of hit requests: 5min: 100.0%, 60min: 100.0%
-------------------------------------------------------------------

So, does it mean that the built-in WebDAV client is not working with Squid?
Is there any workaround for this?
Thanks.

Regards,
Olivier MARCHETTA

-----Original Message-----
From: squid-users [mailto:[hidden email]] On Behalf Of Olivier MARCHETTA
Sent: Tuesday, August 29, 2017 4:53 PM
To: Amos Jeffries <[hidden email]>; [hidden email]
Subject: Re: [squid-users] Squid Reverse Proxy and WebDAV caching

Hello again,

I have quickly setup a Squid version 3.5.26 on Windows and with a minimalist config file:

-------------------------------------------------------------------
acl allsrc src all
http_access allow allsrc
http_port 3128
cache_dir ufs /cygdrive/c/squidcache 100 16 256 coredump_dir /var/cache/squid refresh_pattern \.(jpg|gif|png|txt|docx|xlsx|pdf) 60 90% 600 override-expire reload-into-ims ignore-private
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
# Reverse Proxy settings
https_port 10.10.10.10:443 accel defaultsite=olicomp.sharepoint.com cert=/cygdrive/c/squidssl/sharepoint.com.crt key=/cygdrive/c/squidssl/sharepoint.com.key
cache_peer 123.123.123.123 parent 443 0 originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER
-------------------------------------------------------------------

Same result. I can see the cache disk folder being filled up.
But the HIT will never happen.
Latency stay high, speed stay low on document access / file transfer.
Can Microsoft SharePoint Servers in Office 365 block any caching attempt ?
Also, I am using a WebDAV client. Is WebDAV supported ?

-------------------------------------------------------------------
Cache information for squid:
Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
Hits as % of bytes sent: 5min: 0.0%, 60min: 0.0%
Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
-------------------------------------------------------------------

Regards,
Olivier MARCHETTA


-----Original Message-----
From: Olivier MARCHETTA
Sent: Tuesday, August 29, 2017 2:47 PM
To: 'Amos Jeffries' <[hidden email]>; [hidden email]
Subject: RE: [squid-users] Squid Reverse Proxy and WebDAV caching

Hello Amos,

Thank you for your answer.
I have applied the configuration updates you recommended.
My squid config file is more simple now.
But unfortunately, I can see the cache filling itselt, but not being hit.

Here's the internal manager info log:
------------------------------------------------------------------------------
Cache information for squid:
        Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
        Hits as % of bytes sent: 5min: 0.4%, 60min: 0.4%
        Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
        Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
        Storage Swap size: 70752 KB
        Storage Swap capacity: 69.1% used, 30.9% free
        Storage Mem size: 216 KB
        Storage Mem capacity: 0.2% used, 99.8% free
        Mean Object Size: 1768.80 KB
        Requests given to unlinkd: 56
------------------------------------------------------------------------------

As you can see, the cache swap capacity has been filled at 70%.
But when I try to get the same content from another computer, the transfer is slow and the HIT percentage stay at 0.

I have noticed the following errors in the Cache.log:
Could not parse headers from on disk object

I don't really know what to do next. I do not understand what this error means.

My squid.conf file below:

------------------------------------------------------------------------------
# This file is automatically generated by pfSense # Do not edit manually !

http_port 10.10.10.10.108:3128
icp_port 0
digest_generation off
dns_v4_first on
pid_filename /var/run/squid/squid.pid
cache_effective_user squid
cache_effective_group proxy
error_default_language en
icon_directory /usr/local/etc/squid/icons visible_hostname pfSense Firewall cache_mgr [hidden email] access_log /var/squid/logs/access.log cache_log /var/squid/logs/cache.log cache_store_log none netdb_filename /var/squid/logs/netdb.state pinger_enable on pinger_program /usr/local/libexec/squid/pinger

logfile_rotate 7
debug_options rotate=7
shutdown_lifetime 3 seconds
forwarded_for on
uri_whitespace strip

cache_mem 128 MB
maximum_object_size_in_memory 256 KB
memory_replacement_policy heap GDSF
cache_replacement_policy heap LFUDA
minimum_object_size 0 KB
maximum_object_size 20 MB
cache_dir ufs /var/squid/cache 100 16 256 offline_mode off cache_swap_low 90 cache_swap_high 95 cache allow all

refresh_pattern \.(jpg|gif|png|txt|docx|xlsx|pdf) 60 90% 600 override-expire reload-into-ims ignore-private

# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:    1440  20%  10080
refresh_pattern ^gopher:  1440  0%  1440 refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
refresh_pattern .    0  20%  4320

# Reverse Proxy settings
https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
#
cache_peer tenant.sharepoint.com parent 443 0 originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER

acl allsrc src all
http_access allow all
------------------------------------------------------------------------------

PS: I will keep the DON’T_VERIFY_PEER for testing but I understand this is a security risk.
Thanks.

Regards,
Olivier MARCHETTA


-----Original Message-----
From: squid-users [mailto:[hidden email]] On Behalf Of Amos Jeffries
Sent: Saturday, August 26, 2017 5:21 AM
To: [hidden email]
Subject: Re: [squid-users] Squid Reverse Proxy and WebDAV caching

On 26/08/17 00:49, Olivier MARCHETTA wrote:
> Hello,
>
> Finally Squid is caching my SharePoint online documents.
> But it doesn't work yet.
> If I enable offline mode, the WebDAV client will not be able to download documents from the cache.

That directive was designed for HTTP/1.0 behaviours and only works for objects with optional revalidation, i.e. when the server delegates the caching freshness decision to the proxy.

When it is applied to content with mandatory revalidation, such as anything with no-cache, private, no-store, or must-revalidate directives in HTTP/1.1 traffic, the result is that things are prohibited from being delivered AND prohibited from being updated.


> And I will see the following errors in the log:
>
> ---------------------------------------------------------------------------------
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> TCP_OFFLINE_HIT_ABORTED/000 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1%20-%20Copy%20-%20Copy%20-%20Copy%20-%20Copy.docx
> ----------------------------------------------------------------------
> -----------
>

Squid was simply not able to deliver anything to this client, not even an error message for some reason.

It might be a bug in Squid preventing it from generating an error page (ABORTED with a 5xx status). But usually ABORTED/000 means the client was the one aborting / disconnecting before any HTTP response at all could be delivered.


> If I disable offline mode, then nothing gets downloaded from the cache.

How are you determining that?

What I can see in the info so far provided is that Squid *is* finding cached content to work with.


>
> I have removed all ACL control from the squid conf (to make it easier for now).
> I have replaced all refresh patterns by customs one (that I've found on Internet from another SharePoint caching project).
>
> Sorry for the long file below, but I am posting my conf file again.
> I don't know why the Squid cache is aborting the cache HIT.

You are forcing Squid to cache things that are marked as non-cacheable because they contain client-specific security or privacy details. Since the proxy is unable to determine for itself (on these objects) which details go to which client, caching these things can only be done with revalidation before HIT delivery.

Then you are also configuring Squid to be forbidden to revalidate anything at all.


I suspect we have a bug somewhere in Squid that makes it do the
ABORTED/000; it should be doing a forced-MISS or a 5xx error with your
config. But that is not what you need to happen anyhow, so fixing
that particular bug won't help you.


> If you have any clue, it would be very welcome.
>
>  
> ---------------------------------------------------------------------------------
> http_port 92.222.209.108:3128
> icp_port 0
> digest_generation off
> dns_v4_first on
> pid_filename /var/run/squid/squid.pid
> cache_effective_user squid
> cache_effective_group proxy
> error_default_language en
> icon_directory /usr/local/etc/squid/icons
> visible_hostname sv-1101-wvp01.virtualdesk.cloud
> cache_mgr [hidden email]
> access_log /var/squid/logs/access.log
> cache_log /var/squid/logs/cache.log
> cache_store_log none
> netdb_filename /var/squid/logs/netdb.state
> pinger_enable on
> pinger_program /usr/local/libexec/squid/pinger
>
> logfile_rotate 7
> debug_options rotate=7
> shutdown_lifetime 3 seconds
> # Allow local network(s) on interface(s)
> acl localnet src  92.222.209.0/24
> forwarded_for on
> uri_whitespace strip
>
>
> cache_mem 128 MB
> maximum_object_size_in_memory 512 KB
> memory_replacement_policy heap GDSF
> cache_replacement_policy heap LFUDA
> minimum_object_size 0 KB
> maximum_object_size 20 MB
> cache_dir ufs /var/squid/cache 100 16 256
> offline_mode off
> cache_swap_low 90
> cache_swap_high 95
> cache allow all
>
> # Cache documents regardless what the server says
> refresh_pattern .jpg 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .gif 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .png 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .txt 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .doc 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .docx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xls 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .xlsx 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
> refresh_pattern .pdf 14400 50% 18000 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-cache ignore-private ignore-auth
>


The normal refresh_pattern lines should stay; just place them down here,
following your custom ones. At minimum the cgi-bin and '.' patterns are
necessary for correct handling of dynamic content in the cache.
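
A sketch of the resulting ordering, reusing the custom pattern from earlier in this thread followed by those standard defaults:

-------------------------------------------------------------------
# Custom patterns first:
refresh_pattern \.(jpg|gif|png|txt|docx|xlsx|pdf) 60 90% 600 override-expire reload-into-ims ignore-private

# Then the standard defaults, so dynamic content is still handled correctly:
refresh_pattern ^ftp:    1440  20%  10080
refresh_pattern ^gopher:  1440  0%  1440
refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
refresh_pattern .    0  20%  4320
-------------------------------------------------------------------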

[ Sorry, I pressed send by accident earlier before completing that
"Also," statement, which was intended to say the above. ]


* The ignore-no-cache option was removed from Squid some versions ago.
As I mentioned earlier, CC:no-cache actually means things *are* cacheable
in HTTP/1.1, so the directive's intended effect is met by current Squid's
default behaviour.
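
As a sketch, the revalidation that CC:no-cache triggers looks like this (the validator headers here are illustrative, not captured from this traffic):

-------------------------------------------------------------------
# Squid holds a cached copy, but revalidates with the origin before each HIT:
GET /sites/Marketing/Shared%20Documents/test_img_1.jpg HTTP/1.1
Host: tenant.sharepoint.com
If-Modified-Since: Tue, 29 Aug 2017 14:06:14 GMT

# If the origin confirms nothing changed, the cached body is served:
HTTP/1.1 304 Not Modified
-------------------------------------------------------------------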


* The 50% only means +50% of the object's current age, which can be very
short for frequently or recently updated objects. Percentages over 100%
are possible here, and usually necessary for good caching times.
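
A worked example of that percentage, with hypothetical object ages:

-------------------------------------------------------------------
# Object fetched now, Last-Modified 10 hours ago, no Expires header.
#   refresh_pattern ... 0  50% 18000  -> fresh for  50% of 10h =  5 hours
#   refresh_pattern ... 0 200% 18000  -> fresh for 200% of 10h = 20 hours
# Either result is still clamped between the min (0) and max (18000
# minutes) values on the same line.
-------------------------------------------------------------------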

* override-lastmod was useful once to avoid bugs (and side-effects from
the misconfigured percentages mentioned above). But current Squid can
figure out Last-Modified values from Dates and timestamps as needed. So
the option is rarely necessary, and more often than not actually causes
worse caching by prohibiting Squid from doing heuristic freshness
calculations. YMMV, so testing against your specific traffic is needed
before using this option in current Squid.
  --> and remember how I mentioned offline_mode only works when the
proxy is delegated the freshness calculations? This option prohibits
Squid from doing that calculation and uses the admin's 14400-minute
value instead.


* "reload-into-ims ignore-reload" these two options are mutually
exclusive. Changing a reload header value and ignoring it cannot be done
simultaneously. Pick one:

  ignore-reload - completely ignore the client indication that it needs
the latest data. Note that this is redundant with what offline_mode
does, but far more selective about what URLs it happens for.

  reload-into-ims - ask the server if any changes have happened, so the
cached content can be delivered if nothing changed, instead of a full
re-fetch.


* Since all of these lines are identical except for the regex pattern
for the URLs they apply to, you would save a lot of CPU cycles by
combining the regexes into one pattern and having only one config line
for the lot:

  refresh_pattern \.(jpg|gif|png|txt|docx?|xlsx?|pdf) 14400 50% 18000 \
    override-expire reload-into-ims ignore-private ignore-auth



* ignore-auth - I would also check the actual response headers from the
server before using this option. While authentication credentials
normally mean non-cacheable in HTTP/1.0 traffic, in HTTP/1.1 they mean
mandatory revalidation in most cases and sometimes are irrelevant.
  What this option actually does is exclude the special handling when
auth headers are present - it actively *prevents* some HTTP/1.1 traffic
being HIT on, when the special conditions were saying auth was cacheable
or irrelevant.


> # Setup acls
> acl allsrc src all
> http_access allow all
>
> request_body_max_size 0 KB
> delay_pools 1
> delay_class 1 2
> delay_parameters 1 -1/-1 -1/-1
> delay_initial_bucket_level 100
> delay_access 1 allow allsrc

These delay_parameters are doing nothing but wasting a surprisingly
large amount of CPU time and memory for calculating traffic numbers and
repeatedly pausing transactions for 0 milliseconds.


>
> # Reverse Proxy settings
> https_port 92.222.209.108:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
> cache_peer olicomp.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint

Avoid DONT_VERIFY_PEER like the plague. Find out the CA(s) which sign
the peer's certs and configure Squid to trust only the right CA for
these peer links, then add the NO_DEFAULT_CA flag. Do this even if it is
one of the normal global CAs.

That will prevent unapproved MITM on your upstream traffic and help
detect traffic loops if the DNS+Squid config gets wonky.
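
A minimal sketch of that hardening applied to the cache_peer line from this thread, assuming the signing CA has been exported to a local PEM bundle (the file path is hypothetical):

-------------------------------------------------------------------
cache_peer tenant.sharepoint.com parent 443 0 originserver \
    login=PASSTHRU connection-auth=on \
    ssl sslcafile=/usr/local/etc/squid/sharepoint-ca.pem sslflags=NO_DEFAULT_CA
-------------------------------------------------------------------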


> deny_info TCP_RESET allsrc

This deny_info is explicitly configuring Squid to send a TCP_RESET (aka
ABORTED/000) when ACL "allsrc" is the reason for transaction denial.

With your access control rules removed it should not be having an
effect, but beware of the above when you reinstate those rules.

Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 30/08/17 04:02, Olivier MARCHETTA wrote:
> Hello,
>
> Sorry for posting fast.
> But I have done another test, using Internet Explorer to download the files instead of WebDAV.
> And now I see the cache hits rising up to 100% in memory.

Yay.

>
> -------------------------------------------------------------------
> Cache information for squid:
> Hits as % of all requests: 5min: 17.7%, 60min: 6.2%
> Hits as % of bytes sent: 5min: 4.5%, 60min: 0.2%
> Memory hits as % of hit requests: 5min: 100.0%, 60min: 100.0%
> -------------------------------------------------------------------
>
> So, does it mean that the built-in WebDAV client is not working with Squid ?

It means the server responses were not the cause of the MISS-ing.

To answer your earlier question, yes there are things servers can do to
prevent caching no matter what Squid does. This IE result is evidence
that is not happening.



> Is there any workaround for this ?

Insufficient data right now.

You now need to configure "debug_options 11,2" in your squid.conf and
see what is different in the HTTP client requests from IE versus the
built-in WebDAV. What you find there will determine what workarounds are
needed and/or possible.


Amos

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello Amos,

This morning, for some reason, I can't reproduce the hits in memory.
Squid is only routed for tenant.sharepoint.com, so I don't know what I was hitting yesterday.

But I have collected extended info.
I repeatedly loaded the same .jpg file several times.
Always a Miss (high latency to access the file).
Information below:

Logs from the client (by Fiddler)
-------------------------------------------------------------------------
GET /sites/Marketing/Shared%20Documents/test_img_1.jpg HTTP/1.1
Cache-Control: no-cache
Connection: Keep-Alive
Pragma: no-cache
User-Agent: Microsoft-WebDAV-MiniRedir/10.0.14393
translate: f
Host: tenant.sharepoint.com
Cookie: FedAuth=*

HTTP/1.1 200 OK
Cache-Control: private,max-age=0
Content-Length: 1708509
Content-Type: image/jpeg
Expires: Tue, 15 Aug 2017 09:53:27 GMT
Last-Modified: Tue, 29 Aug 2017 14:06:14 GMT
Accept-Ranges: bytes
ETag: "{852B897C-67C8-4620-AC40-53FB915EB62D},7"
P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
Set-Cookie: rtFa=*; domain=sharepoint.com; path=/; secure; HttpOnly
Set-Cookie: FedAuth=*; path=/; secure; HttpOnly
Set-Cookie: rtFa=*; domain=sharepoint.com; path=/; secure; HttpOnly
Set-Cookie: FedAuth=*; path=/; secure; HttpOnly
X-SharePointHealthScore: 0
ResourceTag: rt:852B897C-67C8-4620-AC40-53FB915EB62D@00000000007
Public-Extension: http://schemas.microsoft.com/repl-2
SPRequestGuid: c24b149e-009e-4000-7792-b7ee64e3b853
request-id: c24b149e-009e-4000-7792-b7ee64e3b853
Strict-Transport-Security: max-age=31536000
X-FRAME-OPTIONS: SAMEORIGIN
SPRequestDuration: 320
SPIisLatency: 6
X-Powered-By: ASP.NET
MicrosoftSharePointTeamServices: 16.0.0.6823
X-Content-Type-Options: nosniff
X-MS-InvokeApp: 1; RequireReadOnly
X-MSEdge-Ref: Ref A: 5BF5463E978A400CB13978F6A380BACF Ref B: AMS04EDGE0816 Ref C: 2017-08-30T09:53:27Z
Date: Wed, 30 Aug 2017 09:53:26 GMT
X-Cache: MISS from squidserver.local
Via: 1.1 squidserver.local (squid/3.5.26)
Connection: keep-alive
-------------------------------------------------------------------------


Log from the Squid server - cache.log
-------------------------------------------------------------------------
----------
2017/08/30 10:53:10.401 kid1| 11,2| http.cc(2230) sendRequest: HTTP Server local=123.123.123.123:54129 remote=13.107.6.151:443 FD 22 flags=1
2017/08/30 10:53:10.401 kid1| 11,2| http.cc(2231) sendRequest: HTTP Server REQUEST:
----------
GET /sites/Marketing/Shared%20Documents/test_img_1.jpg HTTP/1.1
Pragma: no-cache
User-Agent: Microsoft-WebDAV-MiniRedir/10.0.14393
Translate: f
Cookie: FedAuth=*
Host: tenant.sharepoint.com
Via: 1.1 sharepoint.virtualdesk.cloud (squid/3.5.26)
Surrogate-Capability: sharepoint.virtualdesk.cloud="Surrogate/1.0"
X-Forwarded-For: 92.222.48.79
Cache-Control: no-cache
Connection: keep-alive


----------
2017/08/30 10:53:10.765 kid1| ctx: enter level  0: 'https://olicomp.sharepoint.com/sites/Marketing/Shared%20Documents/test_img_1.jpg'
2017/08/30 10:53:10.765 kid1| 11,2| http.cc(735) processReplyHeader: HTTP Server local=123.123.123.123:54129 remote=13.107.6.151:443 FD 22 flags=1
2017/08/30 10:53:10.765 kid1| 11,2| http.cc(736) processReplyHeader: HTTP Server REPLY:
---------
HTTP/1.1 200 OK
Cache-Control: private,max-age=0
Content-Length: 1708509
Content-Type: image/jpeg
Expires: Tue, 15 Aug 2017 09:53:27 GMT
Last-Modified: Tue, 29 Aug 2017 14:06:14 GMT
Accept-Ranges: bytes
ETag: "{852B897C-67C8-4620-AC40-53FB915EB62D},7"
P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
Set-Cookie: rtFa=*; domain=sharepoint.com; path=/; secure; HttpOnly
Set-Cookie: FedAuth=*=; path=/; secure; HttpOnly
Set-Cookie: rtFa=*; domain=sharepoint.com; path=/; secure; HttpOnly
Set-Cookie: FedAuth=*=; path=/; secure; HttpOnly
X-SharePointHealthScore: 0
ResourceTag: rt:852B897C-67C8-4620-AC40-53FB915EB62D@00000000007
Public-Extension: http://schemas.microsoft.com/repl-2
SPRequestGuid: c24b149e-009e-4000-7792-b7ee64e3b853
request-id: c24b149e-009e-4000-7792-b7ee64e3b853
Strict-Transport-Security: max-age=31536000
X-FRAME-OPTIONS: SAMEORIGIN
SPRequestDuration: 320
SPIisLatency: 6
X-Powered-By: ASP.NET
MicrosoftSharePointTeamServices: 16.0.0.6823
X-Content-Type-Options: nosniff
X-MS-InvokeApp: 1; RequireReadOnly
X-MSEdge-Ref: Ref A: 5BF5463E978A400CB13978F6A380BACF Ref B: AMS04EDGE0816 Ref C: 2017-08-30T09:53:27Z
Date: Wed, 30 Aug 2017 09:53:26 GMT
-------------------------------------------------------------------------

Log from the Squid server - access.log
-------------------------------------------------------------------------
TCP_MISS/200 1712549 GET https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/test_img_1.jpg - FIRSTUP_PARENT/13.107.6.151 image/jpeg


Any help for the analysis would be welcome.


Regards,
Olivier MARCHETTA

Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 30/08/17 22:17, Olivier MARCHETTA wrote:

> Hello Amos,
>
> This morning, for some reasons, I can't reproduce the Hits in the memory.
> Squid is only routed for tenant.sharepoint.com so I don't know what I was Hitting yesterday.
>
> But I have collected extended info.
> I repeatedly loaded the same .jpg file several times.
> Always a Miss (high latency to access the file).
> Information below:
>
> Logs from the client (by Fiddler)
> -------------------------------------------------------------------------
> GET /sites/Marketing/Shared%20Documents/test_img_1.jpg HTTP/1.1
> Cache-Control: no-cache

The above as a request header forbids cached content being delivered.

You will need either the reload-into-ims option on refresh_pattern, or
the ignore-cc option on your http_port line (use only as a last resort).
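
As a sketch, the two options applied in the configuration style used in this thread (timing values illustrative; the https_port cert/key options are left off this line for brevity):

-------------------------------------------------------------------
# Option 1: turn client "no-cache" reloads into If-Modified-Since revalidations
refresh_pattern -i \.(jpg|gif|png|txt|docx|xlsx|pdf) 14400 100% 43800 override-expire reload-into-ims

# Option 2 (last resort): ignore client Cache-Control on the accelerator port
https_port 10.10.10.10:443 accel ignore-cc defaultsite=tenant.sharepoint.com
-------------------------------------------------------------------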

> Connection: Keep-Alive
> Pragma: no-cache
> User-Agent: Microsoft-WebDAV-MiniRedir/10.0.14393
> translate: f
> Host: tenant.sharepoint.com
> Cookie: FedAuth=*
>
> HTTP/1.1 200 OK
> Cache-Control: private,max-age=0
> Content-Length: 1708509
> Content-Type: image/jpeg
> Expires: Tue, 15 Aug 2017 09:53:27 GMT
> Last-Modified: Tue, 29 Aug 2017 14:06:14 GMT

Strange content, it expires 2 weeks before it was created/modified.


> Accept-Ranges: bytes
> ETag: "{852B897C-67C8-4620-AC40-53FB915EB62D},7"
<snip>
> Date: Wed, 30 Aug 2017 09:53:26 GMT
> X-Cache: MISS from squidserver.local
> Via: 1.1 squidserver.local (squid/3.5.26)
> Connection: keep-alive
> -------------------------------------------------------------------------
>

Anyhow, this response is stale on delivery (max-age=0 and Expires
timestamp older than Date timestamp). So to cache it you will also need
the "store-stale" option on your matching refresh_pattern line.

You might also want to setup a limit on staleness with max_stale global
directive, or max-stale=N refresh_pattern option.
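
One possible combination of those options, as a sketch (the one-month staleness cap and timing values are illustrative):

-------------------------------------------------------------------
# Cap how stale a served object may be:
max_stale 1 month
# Cache and serve responses that arrive already stale:
refresh_pattern -i \.(jpg|gif|png|txt|docx|xlsx|pdf) 30240 100% 43800 override-expire ignore-private ignore-reload store-stale
-------------------------------------------------------------------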



Amos


Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hello,

I've made many tests, but Squid does not seem to want to deliver from the cache.
I think the objects are in the cache; I have modified the in-memory cache object size,
and now I can see the memory being filled up as I transfer / GET the files from SharePoint Online / Office 365.

Do you think that any configuration change would work?
I was thinking about rewriting URLs upfront, before the Squid cache proxy, in a chain configuration.
But I am trying to avoid it for now.

My Squid config:
-------------------------------------------------------------------
acl allsrc src all
http_access allow allsrc
#
http_port 3128
#
cache_dir ufs /cygdrive/c/squidcache 100 16 256
#
cache_mem 128 MB
minimum_object_size 0 bytes
maximum_object_size 50 MB
maximum_object_size_in_memory 10 MB
max_stale 1 month
#
coredump_dir /var/cache/squid
#
debug_options 11,2
#
refresh_pattern -i \.(jpg|gif|png|txt|docx|xlsx|pdf) 30240 100% 43800 override-expire ignore-private ignore-reload store-stale
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
#
https_port 10.10.10.10:443 accel ignore-cc defaultsite=tenant.sharepoint.com cert=/cygdrive/c/squidssl/sharepoint.com.crt key=/cygdrive/c/squidssl/sharepoint.com.key
#
cache_peer 13.107.6.151 parent 443 0 originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER
-------------------------------------------------------------------

Regards,
Olivier MARCHETTA



Re: Squid Reverse Proxy and WebDAV caching

Amos Jeffries
Administrator
On 31/08/17 03:35, Olivier MARCHETTA wrote:
> Hello,
>
> I've made many test, but it seems not wanting to deliver from the cache.
> I think the objects are in the cache, I have modified the cache in memory object size.
> And now I can see the memory being filled up as I transfer / GET the files from SharePoint Online / Office 365.
>
> Do you think that any configuration change would work ?

What you have now should be caching the responses like the one in your
previous mail, AND serving them to clients.

I can only guess that something is wrong with your tests. Or that the
previous mail's transaction is not actually a typical object.



> I was thinking about rewriting URLs upfront, before the Squid Cache proxy, in a chain configuration.
> But I am trying to avoid it for now.

It would not help. The URL is just part of the hash key for caching. The
other HTTP mechanisms are the things causing HIT vs MISS vs REFRESH
behavior and you already have configured to override those.


Amos

Re: Squid Reverse Proxy and WebDAV caching

Olivier MARCHETTA
Hi Amos,

It works now.
I made a proper test between 2 clients and using Robocopy.
Here's the cache HIT result:
-----------------------------------------------------------------
Cache information for squid:
Hits as % of all requests: 5min: 15.2%, 60min: 14.4%
Hits as % of bytes sent: 5min: 67.4%, 60min: 67.4%
Memory hits as % of hit requests: 5min: 100.0%, 60min: 100.0%
Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
Storage Swap size: 70752 KB
Storage Swap capacity: 69.1% used, 30.9% free
Storage Mem size: 70968 KB
Storage Mem capacity: 54.1% used, 45.9% free
Mean Object Size: 1768.80 KB
Requests given to unlinkd: 0
-----------------------------------------------------------------

Robocopy log from WORKSTATION1
Files :   40
Copied: 40
Time :   00:48  

Robocopy log from WORKSTATION2
Files :   40
Copied: 40
Time :   00:14  

If I clear the WebDAV client cache on WORKSTATION1 and execute the copy test again, it will also download from the cache.
The overall copy time will be below 15 seconds instead of 50 seconds.
I no longer get any errors when reading a file from the cache (as I had before),
and the copied files are healthy.

Great!
I will wait before saying it's a victory.
But at least we can now read files from the Squid cache,
which was the most important step before going any further. 😊

Thank you very much for your help and accurate answers.

Regards,
Olivier MARCHETTA


-----Original Message-----
From: Amos Jeffries [mailto:[hidden email]]
Sent: Wednesday, August 30, 2017 4:56 PM
To: Olivier MARCHETTA <[hidden email]>; [hidden email]
Subject: Re: [squid-users] Squid Reverse Proxy and WebDAV caching

On 31/08/17 03:35, Olivier MARCHETTA wrote:
> Hello,
>
> I've made many test, but it seems not wanting to deliver from the cache.
> I think the objects are in the cache, I have modified the cache in memory object size.
> And now I can see the memory being filled up as I transfer / GET the files from SharePoint Online / Office 365.
>
> Do you think that any configuration change would work ?

What you have now should be caching the responses like the one in your previous mail, AND serving them to clients.

I can only guess that something is wrong with your tests. Or that the previous mails transaction is not actually a typical object.



> I was thinking about rewriting URLs upfront, before the Squid Cache proxy, in a chain configuration.
> But I am trying to avoid it for now.

It would not help. The URL is just part of the hash key for caching. The
other HTTP mechanisms are the things causing HIT vs MISS vs REFRESH
behavior and you already have configured to override those.


Amos