Data trickling implementation is on ICAP side or Squid side?

jzhu

Hi all,

We have a similar content-scanning project built on Squid and ICAP. When the ICAP content scanner takes a very long time to scan a large file (> 10 MB), the client browser's session is disconnected. As people in this community have suggested, we would like to add a “data trickling” feature for a better user experience: slow down the transmission to the client while waiting for the ICAP scan to complete.

I implemented the ICAP side in Java. I have some questions about using “data trickling” to handle slow ICAP responses during large-file scanning. Could you please advise?

1) Are there any Java libraries for data trickling on the ICAP side?

2) Does the trickling feature need any configuration change on the Squid side?

3) Does it need any code change on the Squid side?

4) Trickling (sending data to Squid at a very slow rate) is implemented only on the ICAP server side, correct?

 

Thank you all,

 

John Zhu

 


_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Data trickling implementation is on ICAP side or Squid side?

Alex Rousskov
On 1/20/21 3:21 PM, John Zhu wrote:

> I implemented ICAP in java. I have questions regarding the “data
> trickling” to handle slow response for large file scanning from ICAP.

> 1) Java libraries available for data trickling at ICAP side, if any?

FWIW, implementing a production ICAP server from scratch (in any
language) is usually a bad idea -- there are too many poorly documented
and barely understood protocol areas. There are production ICAP servers
that support data trickling.

Unfortunately, I am not familiar with Java libraries, but there were two
ICAP projects in Java:
https://wiki.squid-cache.org/Features/ICAP#ICAP_Servers


> 2) Need any configuration change for trickling feature on Squid side?
> 3) Need any code change on Squid side?

I do not recall any required changes. Squid itself can be unaware that
data trickling is going on. However, it is possible that, in some
extreme cases (e.g., trickling one byte at a time), some configuration
or code adjustments would be needed to force "flushing" of that data
through Squid or to fix Squid metadata parsing bugs.


> 4) Trickling (sending data to Squid at a very slow rate) is
> implemented only on the ICAP server side, correct?

The ICAP client side must not buffer (as in "delay to aggregate") data
trickled by the ICAP server.
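
To illustrate the server side of this, here is a minimal Java sketch of trickling, assuming the ICAP server writes the encapsulated HTTP body to the connection's `OutputStream` using chunked transfer coding (as ICAP does for bodies). The class `TrickleWriter` and its methods are hypothetical names, not part of any ICAP library. A real service would also hold back the last part of the file until the scan verdict is known. The sketch only shows small chunks being flushed one at a time rather than aggregated.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sends a body in small, individually flushed chunks. */
final class TrickleWriter {
    private final OutputStream out;

    TrickleWriter(OutputStream out) {
        this.out = out;
    }

    /** Writes one chunk in chunked transfer coding and flushes it immediately. */
    void writeChunk(byte[] data, int offset, int length) throws IOException {
        if (length <= 0) {
            return; // a zero-length chunk would terminate the body
        }
        out.write((Integer.toHexString(length) + "\r\n").getBytes(StandardCharsets.US_ASCII));
        out.write(data, offset, length);
        out.write("\r\n".getBytes(StandardCharsets.US_ASCII));
        out.flush(); // do not aggregate trickled bytes on the sending side
    }

    /** Writes the terminating zero-length chunk. */
    void finish() throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }

    /** Sends data in pieces of pieceSize bytes, pausing between pieces. */
    void trickle(byte[] data, int pieceSize, long pauseMillis)
            throws IOException, InterruptedException {
        for (int pos = 0; pos < data.length; pos += pieceSize) {
            writeChunk(data, pos, Math.min(pieceSize, data.length - pos));
            if (pos + pieceSize < data.length) {
                Thread.sleep(pauseMillis); // slow the transfer while scanning continues
            }
        }
    }
}
```

The piece size and pause are tuning assumptions. Very small pieces (such as one byte at a time) are the extreme case mentioned above that may need extra attention on the Squid side.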


HTH,

Alex.

Re: Data trickling implementation is on ICAP side or Squid side?

jzhu

Hi, All,

 

I have a weird issue. I set up Squid and ICAP. When the ICAP server (in RESPMOD) sends the response body (any file type, mostly large files) to Squid at a relatively slow speed and more than 1 minute elapses, the browser closes the session and the download fails. The Squid log shows the error TCP_MISS_ABORTED/206.

Here are the log entry and the configuration. I am new to Squid.

 

==> /usr/local/squid/var/logs/access.log <==

1613593651.769  59962 172.90.1.1 TCP_MISS_ABORTED/206 3635 GET https://pfpt-my.sharepoint.com/personal/jzhu_company _com/_layouts/15/download.aspx? - HIER_DIRECT/13.107.136.9 application/pdf
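
For reading such a line, here is a minimal Java sketch that pulls out the leading fields, assuming Squid's default native access.log layout (timestamp, elapsed milliseconds, client address, result code/HTTP status, bytes sent to the client, method, ...). The class name `AccessLogEntry` is hypothetical. Only the first six fields are parsed, so it does not depend on how the URL is written.

```java
/** Hypothetical holder for the leading fields of a native-format access.log line. */
final class AccessLogEntry {
    final double timestamp;   // seconds since the epoch
    final long elapsedMillis; // how long the transaction lasted
    final String client;      // client address
    final String resultCode;  // e.g. TCP_MISS_ABORTED
    final int httpStatus;     // e.g. 206
    final long bytes;         // size of the reply sent to the client
    final String method;      // e.g. GET

    private AccessLogEntry(String[] f) {
        timestamp = Double.parseDouble(f[0]);
        elapsedMillis = Long.parseLong(f[1]);
        client = f[2];
        int slash = f[3].lastIndexOf('/');
        resultCode = f[3].substring(0, slash);
        httpStatus = Integer.parseInt(f[3].substring(slash + 1));
        bytes = Long.parseLong(f[4]);
        method = f[5];
    }

    /** Parses the first six whitespace-separated fields of a log line. */
    static AccessLogEntry parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 6 || f[3].indexOf('/') < 0) {
            throw new IllegalArgumentException("unexpected access.log line: " + line);
        }
        return new AccessLogEntry(f);
    }
}
```

Read this way, the entry above shows a transaction that lasted 59962 ms (just under one minute) and delivered 3635 bytes to the client before it was aborted.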

 

 

acl SSL_ports port 443
acl Safe_ports port 80    # http
acl Safe_ports port 21    # ftp
acl Safe_ports port 443       # https
acl Safe_ports port 70    # gopher
acl Safe_ports port 210       # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280       # http-mgmt
acl Safe_ports port 488       # gss-http
acl Safe_ports port 591       # filemaker
acl Safe_ports port 777       # multiling http
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
http_access deny manager
http_access allow all

# This is to help with the development process only
#cache deny all


cache_mem 1024 MB
maximum_object_size 200 MB
cache_swap_low 90
cache_swap_high 95
quick_abort_min -1

refresh_pattern ^ftp:     1440   20%    10080
refresh_pattern ^gopher:   1440   0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0    0% 0
refresh_pattern (Release|Packages(.gz)*)$      0       20%     2880
refresh_pattern .     0  20%    4320

# Docker-compose setup
icap_enable on
icap_io_timeout 600 seconds
icap_connect_timeout 600 seconds

icap_service service_req reqmod_precache bypass=1 icap://icapserver:1344/req
icap_service service_resp respmod_precache bypass=1 icap://icapserver:1344/resp
adaptation_access service_req allow all
adaptation_access service_resp allow all

 

 

Thank you all,

 

John Zhu

 



Re: Data trickling implementation is on ICAP side or Squid side?

Alex Rousskov
On 2/18/21 12:36 AM, John Zhu wrote:

> I have a weird issue. I set up Squid and ICAP. When the ICAP server (in
> RESPMOD) sends the response body (any file type, mostly large
> files) to Squid at a relatively slow speed and more than 1 minute
> elapses, the browser closes the session and the download
> fails,

During that minute, does the browser actually get any HTTP response
bytes that the ICAP service sent to Squid? Just the response headers?
Nothing at all?


> the squid log shows the error of TCP_MISS_ABORTED/206

HTTP 206 is the Partial Content response to a Range request. Did the origin
server respond with an HTTP 206 response? Or did the ICAP server
convert an HTTP 200 response into an HTTP 206 response? Or did Squid
do that conversion?

If Squid does the conversion, then perhaps Squid just has no data to
send to the client because the requested range has not come from the
data trickling ICAP service yet?
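
To make that last scenario concrete, here is a minimal Java sketch, assuming a request header of the simple single-range form `Range: bytes=first-last` (or `bytes=first-`). The class `ByteRange` is a hypothetical name. Suffix ranges such as `bytes=-500` and multi-range requests are deliberately not handled and yield `null`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for a single "bytes=first-last" range. */
final class ByteRange {
    private static final Pattern SINGLE_RANGE =
            Pattern.compile("bytes=(\\d+)-(\\d*)", Pattern.CASE_INSENSITIVE);

    final long first; // offset of the first requested byte
    final long last;  // offset of the last requested byte, or -1 when open-ended

    private ByteRange(long first, long last) {
        this.first = first;
        this.last = last;
    }

    /** Returns the parsed range, or null if the value is not a simple single range. */
    static ByteRange parse(String headerValue) {
        Matcher m = SINGLE_RANGE.matcher(headerValue.trim());
        if (!m.matches()) {
            return null;
        }
        long first = Long.parseLong(m.group(1));
        long last = m.group(2).isEmpty() ? -1L : Long.parseLong(m.group(2));
        return new ByteRange(first, last);
    }

    /** True once the body bytes received so far include the first requested byte. */
    boolean firstByteAvailable(long bytesReceived) {
        return bytesReceived > first;
    }
}
```

For example, if the client asked for `bytes=1000000-` and only a few kilobytes have trickled in, `firstByteAvailable` stays false and there is nothing from the requested range to forward yet.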


HTH,

Alex.




Re: Data trickling implementation is on ICAP side or Squid side?

jzhu


On 2/17/21, 10:28 PM, "Alex Rousskov" <[hidden email]> wrote:

> On 2/18/21 12:36 AM, John Zhu wrote:
>
> > I have a weird issue. I set up Squid and ICAP. When the ICAP server (in
> > RESPMOD) sends the response body (any file type, mostly large
> > files) to Squid at a relatively slow speed and more than 1 minute
> > elapses, the browser closes the session and the download
> > fails,
>
> During that minute, does the browser actually get any HTTP response
> bytes that the ICAP service sent to Squid? Just the response headers?
> Nothing at all?

Yes. I implemented the data trickling feature on the ICAP server side. In the Firefox browser I can see:
1) a prompt to save or open the file
2) the progress bar receiving a few bytes every second

> > the squid log shows the error of TCP_MISS_ABORTED/206
>
> HTTP 206 is Partial Content response to a Range request. Did the origin
> server respond with an HTTP 206 response? Or did the ICAP server
> convert an HTTP 200 response into an HTTP 206 response? Or did Squid
> do that conversion?

Yes, the 206 partial body data is sent back to Squid by the ICAP server, not by the origin server.

> If Squid does the conversion, then perhaps Squid just has no data to
> send to the client because the requested range has not come from the
> data trickling ICAP service yet?

What is a requested range? Is it put in the ICAP response header? This is the ICAP header sent back to Squid:

ICAP/1.0 200 OK
Date: Mon, 15 Feb 2021 04:35:37 +0000



Re: Data trickling implementation is on ICAP side or Squid side?

Alex Rousskov
On 2/18/21 1:52 AM, John Zhu wrote:

> On 2/17/21, 10:28 PM, "Alex Rousskov" wrote:
>
>     On 2/18/21 12:36 AM, John Zhu wrote:
>
>     > I have a weird issue. I set up Squid and ICAP. When the ICAP server (in
>     > RESPMOD) sends the response body (any file type, mostly large
>     > files) to Squid at a relatively slow speed and more than 1 minute
>     > elapses, the browser closes the session and the download
>     > fails,
>
>     During that minute, does the browser actually get any HTTP response
>     bytes that the ICAP service sent to Squid? Just the response headers?
>     Nothing at all?


> Yes. I implemented the data trickling feature on the ICAP server side. In the Firefox browser I can see:
> 1) a prompt to save or open the file
> 2) the progress bar receiving a few bytes every second

Please note that my questions were not about the functionality in
general, but the problematic transaction specifically.

If the browser is constantly receiving data during that minute, then it
probably does not time out. What changes after that minute? You may get
more information from the browser developer console or equivalent. The
browser should log the reason for transaction termination somewhere.


>     > the squid log shows the error of TCP_MISS_ABORTED/206
>
>     HTTP 206 is Partial Content response to a Range request. Did the origin
>     server respond with an HTTP 206 response? Or did the ICAP server
>     convert an HTTP 200 response into an HTTP 206 response? Or did Squid
>     do that conversion?

> Yes, the 206 partial body data is sent back to Squid by the ICAP server, not by the origin server.

Are you sure that the origin server did not send an HTTP 206 response to
Squid? An ICAP server cannot replace an HTTP 200 OK response with an
HTTP 206 response _unless_ the request had a Range header, and even that
theoretically-possible 200->206 rewrite may not be supported by Squid
today (I have not checked that it is supported).


>     If Squid does the conversion, then perhaps Squid just has no data to
>     send to the client because the requested range has not come from the
>     data trickling ICAP service yet?
>
> What is a requested range? Is it put in the ICAP response header?

To learn about Range requests (and 206 responses to them), please see
RFC 7233: https://tools.ietf.org/html/rfc7233


> This is the ICAP header back to squid
> ICAP/1.0 200 OK

We are talking about HTTP headers, not ICAP headers. What HTTP headers
does your service receive and what HTTP headers does your service send back?
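
One way to capture those HTTP headers inside the service is sketched below in Java, assuming the ICAP message carries an `Encapsulated` header of the form `req-hdr=0, res-hdr=137, res-body=296` (RFC 3507) and that `encapsulated` holds the bytes following the ICAP headers. The class `Encapsulated` and its methods are hypothetical names. The sketch is meant for the header sections only, since the body section is chunked.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for locating the HTTP header sections in an ICAP message. */
final class Encapsulated {
    /** Parses "req-hdr=0, res-hdr=137, res-body=296" into section offsets. */
    static Map<String, Integer> parseOffsets(String headerValue) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("bad Encapsulated entry: " + part);
            }
            offsets.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return offsets;
    }

    /** Returns the named section (e.g. "res-hdr") as text, or null if it is absent. */
    static String section(String name, Map<String, Integer> offsets, byte[] encapsulated) {
        Integer start = offsets.get(name);
        if (start == null) {
            return null;
        }
        int end = encapsulated.length;
        for (int off : offsets.values()) {
            if (off > start && off < end) {
                end = off; // a section ends where the next one begins
            }
        }
        return new String(encapsulated, start, end - start, StandardCharsets.ISO_8859_1);
    }
}
```

Logging `section("req-hdr", ...)` and `section("res-hdr", ...)` for the failing transaction would show whether the request carried a Range header and whether the HTTP response arrived at the service as 200 or 206.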


Cheers,

Alex.

