Squid not failing over to secondary DNS host

Geoffrey
Hello folks,

I am finding that Squid will not use the secondary DNS server when the
primary is offline. I have taken the primary offline for testing, so
Squid should query the secondary, but it does not.

I have two Windows recursive DNS servers, 192.168.100.249 and
192.168.100.248, which are statically specified in /etc/resolv.conf. I
am authenticating against AD using i) Kerberos and ii) NTLM.

This looks like an issue with Squid's internal DNS client rather than
the operating system. While 192.168.100.249 is offline, all other
queries made from the command line work OK, which indicates the system
is using the secondary DNS server fine… just not Squid!

What we want, of course, is that if the primary (192.168.100.249) is
down or cannot contact the root DNS servers, Squid contacts the
secondary nameserver specified on the LAN (as configured in
resolv.conf) and resolves the name.

* Squid is successfully reading resolv.conf, as shown in cache.log
after a reload.
* Setting the DNS resolvers directly in the Squid config file with
'dns_nameservers' does not resolve the issue; the symptom is
identical.
* I lowered the Squid DNS timeouts (to less than 10 seconds) for
testing, but it made no difference.
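
To compare the configured resolver order against what Squid later reports, a minimal Python sketch that lists the nameserver entries of a resolv.conf-style file in the order they appear. The function name parse_nameservers and the sample text are illustrative assumptions; nothing here is part of Squid.

```python
def parse_nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they appear in the text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Sample content modelled on the setup described in this post.
SAMPLE_RESOLV_CONF = """\
# static configuration
nameserver 192.168.100.249
nameserver 192.168.100.248
search localdom.local
"""

for position, address in enumerate(parse_nameservers(SAMPLE_RESOLV_CONF), 1):
    print(position, address)
```

To inspect the real file, the same function could be called on the contents of /etc/resolv.conf, for example parse_nameservers(open("/etc/resolv.conf").read()).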

Many thanks for any ideas you may have.

Kind regards,
Geoff
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Squid not failing over to secondary DNS host

Amos Jeffries
Administrator
On 12/10/17 15:04, Geoffrey wrote:
> Hello folks,
>
> I am finding that Squid will not use the secondary DNS if the first
> one is taken offline. In this case the primary DNS is not able to
> respond because I have taken it offline, and therefore the secondary
> DNS should be queried by squid, but is not.
>

How are you determining that exactly?
  squid logs? DNS logs? firewall counters? packet traces?


> I have 2 Windows recursive DNS servers; 192.168.100.249 and
> 192.168.100.248, that are statically specified in /etc/resolv.conf. I
> am authenticating against AD using i) Kerberos and ii) NTLM.
>
> This looks like it is a Squid internal dns client response rather than
> operating system. While 192.168.100.249 is offline, all other queries
> done by command-line queries work OK which indicates the system is
> using the secondary DNS server fine… just not Squid!
>
> What we want to happen of course is that if the primary
> (192.168.100.249) is down or it cannot contact root DNS servers, then
> it contacts the secondary nameserver specified on the LAN (as per the
> configuration in resolv.conf) and resolves the name.
>
> *Squid is SUCCESSFULLY reading resolv.conf as proved in cache.log after reload
> *Setting dns resolvers directly in the squid config file with
> 'dns_nameservers' does not resolve the issue as the symptom is
> identical
> *modified squid dns timeouts to a low value (less than 10 secs) for
> testing but made no difference
>
> Many thanks for any ideas you may have.


What does the cachemgr "idns" report say?


command line:
   squidclient mgr:idns

or URL:
   http://$(visible_hostname):3128/squid-internal-mgr/idns


Amos

Re: Squid not failing over to secondary DNS host

Geoffrey
Thanks for your reply Amos.

I just realised I left out some info in the original email that was
pertinent. :)


>How are you determining that exactly?
> squid logs? DNS logs? firewall counters? packet traces?

Quite simply by trial and error and monitoring the results of taking
the 2 DNS/DCs offline/online, and using the cachemgr report.

E.g., here is the report taken after I loaded one page, took the
primary DNS offline, and then tried to browse to two more pages. Those
two pages did not load, and the cachemgr report seems to confirm that
Squid is not using the secondary DNS server at all (the primary shows
27 queries and 9 replies; the secondary shows none).


root@websafetyv51:~# squidclient mgr:idns
HTTP/1.1 200 OK
Server: squid/3.5.23
Mime-Version: 1.0
Date: Thu, 12 Oct 2017 05:30:12 GMT
Content-Type: text/plain;charset=utf-8
Expires: Thu, 12 Oct 2017 05:30:12 GMT
Last-Modified: Thu, 12 Oct 2017 05:30:12 GMT
X-Cache: MISS from websafetyv51.localdom.local
X-Cache-Lookup: MISS from websafetyv51.localdom.local:3128
Via: 1.1 websafetyv51.localdom.local (squid/3.5.23)
Connection: close

Internal DNS Statistics:

The Queue:
                       DELAY SINCE
  ID   SIZE SENDS FIRST SEND LAST SEND M FQDN
------ ---- ----- ---------- --------- - ----

DNS jumbo-grams: not working

Nameservers:
IP ADDRESS                                     # QUERIES # REPLIES Type
---------------------------------------------- --------- --------- --------
192.168.100.249                                      27         9 recurse
192.168.100.248                                       0         0 recurse

Rcode Matrix:
RCODE ATTEMPT1 ATTEMPT2 ATTEMPT3 PROBLEM
    0     1550        0        0 : Success
    1        0        0        0 : Packet Format Error
    2        0        0        0 : DNS Server Failure
    3        4        0        0 : Non-Existent Domain
    4        0        0        0 : Not Implemented
    5        0        0        0 : Query Refused
    6        0        0        0 : Name Exists when it should not
    7        0        0        0 : RR Set Exists when it should not
    8        0        0        0 : RR Set that should exist does not
    9        0        0        0 : Server Not Authoritative for zone
   10        0        0        0 : Name not contained in zone
   16        0        0        0 : Bad OPT Version or TSIG Signature Failure

Search list:
localdom.local


Squid version: Squid Object Cache: Version 3.5.23
Ubuntu server: "Ubuntu 16.04.3 LTS"
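
The per-nameserver counters in a report like the one above can be pulled out programmatically, which makes it easier to compare several reports. A minimal Python sketch, assuming the "Nameservers:" section keeps the layout shown in this post; the function names are hypothetical and the sample is copied from the report above.

```python
import re

# Matches data rows such as "192.168.100.249    27    9 recurse".
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\S+)$")


def parse_nameserver_counters(report):
    """Return {address: (queries, replies)} from the 'Nameservers:' section."""
    counters = {}
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith("Nameservers:"):
            in_section = True
            continue
        if not in_section:
            continue
        if not stripped:
            if counters:
                break  # a blank line after the data rows ends the section
            continue
        match = ROW.match(stripped)
        if match:  # header and separator rows do not match and are skipped
            address, queries, replies, _kind = match.groups()
            counters[address] = (int(queries), int(replies))
    return counters


def unused_nameservers(counters):
    """Return the addresses that have received no queries at all."""
    return [addr for addr, (queries, _replies) in counters.items() if queries == 0]


SAMPLE_REPORT = """\
Nameservers:
IP ADDRESS                                     # QUERIES # REPLIES Type
---------------------------------------------- --------- --------- --------
192.168.100.249                                      27         9 recurse
192.168.100.248                                       0         0 recurse

Rcode Matrix:
"""

counters = parse_nameserver_counters(SAMPLE_REPORT)
print(counters)
print(unused_nameservers(counters))
```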

Cheers
Geoffrey




Re: Squid not failing over to secondary DNS host

Amos Jeffries
Administrator
On 12/10/17 18:44, Geoffrey wrote:

> Thanks for your reply Amos.
>
> I just realised I left out some info in the original email that was
> pertinent. :)
>
>
>> How are you determining that exactly?
>> squid logs? DNS logs? firewall counters? packet traces?
>
> Quite simply by trial and error and monitoring the results of taking
> the 2 DNS/DCs offline/online, and using the cachemgr report.
>
> EG. here is the report after I loaded one page and then took the
> primary DNS offline, then continued to browse to two more pages. The
> latter two pages did not load and the cachemgr report seems to verify
> that squid is not using the secondary dns server at all (primary dns
> server having 27 queries to 9 replies and the secondary getting none).
>
>
> root@websafetyv51:~# squidclient mgr:idns
> Date: Thu, 12 Oct 2017 05:30:12 GMT
> Via: 1.1 websafetyv51.localdom.local (squid/3.5.23)
...

>
> Internal DNS Statistics:
>
> The Queue:
>                         DELAY SINCE
>    ID   SIZE SENDS FIRST SEND LAST SEND M FQDN
> ------ ---- ----- ---------- --------- - ----
>
> DNS jumbo-grams: not working
>
> Nameservers:
> IP ADDRESS                                     # QUERIES # REPLIES Type
> ---------------------------------------------- --------- --------- --------
> 192.168.100.249                                      27         9 recurse
> 192.168.100.248                                       0         0 recurse
>
> Rcode Matrix:
> RCODE ATTEMPT1 ATTEMPT2 ATTEMPT3 PROBLEM
>      0     1550        0        0 : Success
...

That is a bit odd, as is the fact that the ~1550 queries in the Rcode
matrix are not showing up in the nameserver counters.

Do you have ICMP and ICMPv6 working in your network? If not, that is
probably part of the issue.

Are you using DROP rules or policies in your firewalls? That can also
lead to missing packets like this.

Are you able to perform some more careful tests?
  * Restart Squid with both resolvers active and take snapshots of that
report periodically throughout the test. Allow sufficient time after
shutting down the first resolver for any packet or query TTLs to expire.
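
The periodic snapshots could be collected with a small script rather than by hand. A minimal Python sketch, assuming squidclient is on the PATH and that `squidclient mgr:idns` works as shown earlier in the thread; the function names and the fetch/sleep parameters are hypothetical, added so the loop can be exercised without a running Squid.

```python
import subprocess
import time
from datetime import datetime, timezone


def run_squidclient():
    """Run the command quoted in this thread and return its output as text."""
    result = subprocess.run(
        ["squidclient", "mgr:idns"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def take_snapshots(count, interval_seconds, fetch=run_squidclient, sleep=time.sleep):
    """Collect `count` reports, pausing `interval_seconds` between them.

    Returns a list of (UTC timestamp, report text) pairs.
    """
    snapshots = []
    for index in range(count):
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        snapshots.append((stamp, fetch()))
        if index < count - 1:
            sleep(interval_seconds)  # no pause after the final snapshot
    return snapshots
```

For instance, take_snapshots(10, 60) would gather ten reports a minute apart; the count and interval are arbitrary and should be chosen to span the period before and after the first resolver is shut down.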


If you could also check whether either resolver is responding using
alternative IP addresses it would help clarify what is going on.
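
One reading of this check is to send a query to each resolver and see which source address the reply arrives from. A minimal Python sketch under that assumption, using only the standard library; the function names are hypothetical and the packet layout follows the standard DNS message format. By default Squid ignores replies that come from an address it did not query (see the ignore_unknown_nameservers directive), which is why a mismatch would matter.

```python
import secrets
import socket
import struct


def build_query(hostname, query_id):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set = 0x0100), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    )
    # QTYPE=A (1), QCLASS=IN (1)
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def probe(server, hostname, timeout=3.0):
    """Send one UDP query to `server`; return the reply's source IP, or None."""
    query_id = secrets.randbelow(65536)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname, query_id), (server, 53))
            while True:
                data, (source, _port) = sock.recvfrom(4096)
                # Only accept a datagram that echoes our query ID.
                if len(data) >= 2 and struct.unpack("!H", data[:2])[0] == query_id:
                    return source
        except OSError:
            # Covers timeouts and network errors: no usable reply arrived.
            return None
```

Calling probe("192.168.100.248", "example.com") and getting back an address other than 192.168.100.248 would suggest the resolver answers from a different interface; the hostname here is just a placeholder.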


Amos

Re: Squid not failing over to secondary DNS host

Amos Jeffries
Administrator
On 16/10/17 20:24, Geoffrey wrote:

> Hello Amos
>
>> Do you have ICMP and ICMPv6 working in your network? If not that is probably part of the issue.
>
> AND
>
>> Are you using DROP rules or policies in your firewalls? that can also lead to missing packets like this.
>
> You may be getting warm. I have IPv6 disabled on the proxy server
> (kernel), but more interestingly I notice that the Windows System
> Admin has a bunch of ICMP ingress block rules on the Windows DNS
> servers.
>
> What ICMP does Squid (or is it the pinger involved?) require for DNS
> to fail over? I will have to ask the Windows Admin to make the changes
> via group policy, as I cannot modify them.
>

The pinger uses ICMP echo, so that is optional.

The other parts of ICMP, which control TCP routing, path MTU, IP
discovery / ARP and such things, are critical.

See <https://tools.ietf.org/html/rfc4890> for guidelines
and <https://sites.google.com/site/ipv6center/icmpv6-is-non-optional>
for a case study on why those guidelines need to be followed.

Amos