dns failover failing with 3.4.7

talikarni
We are running into an issue using this squid.conf entry:
dns_nameservers 72.x.x.x 72.x.y.y

These are different servers (under our control, used for filtering)
than the ones listed in resolv.conf (which are out of our control and
are used for server IP routing by the upstream host).

The problem we found this weekend is that if the primary listed DNS
server is unavailable, Squid fails to use the secondary listed server.
Instead it displays "unable to connect" type messages for all
websites.

How do we fix this so that if the primary fails, Squid goes to the
secondary (and possibly a tertiary)?

Thanks in advance
Mike

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: dns failover failing with 3.4.7

Amos Jeffries
Administrator
On 28/07/2015 8:38 a.m., Mike wrote:

> Running into an issue, using the squid.conf entry
> dns_nameservers 72.x.x.x 72.x.y.y
>
> These are different servers (under our control) for the purpose of
> filtering than listed in resolv.conf (which are out of our control, used
> for server IP routing by upstream host).
>
> The problem we found like this weekend is if the primary listed dns
> server is unavailable, squid fails to use the secondary listed server.
> Instead it displays the "unable to connect" type messages with all
> websites.

Details please. How do you know the secondary is not even being tried?

What is Squid getting back from the primary when it's "down"?
 Or is just dns_timeout being hit?

Add this to squid.conf to get a cache.log trace of the DNS activity:
  debug_options 78,6


Amos
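
What each nameserver returns when it is "down" can also be checked
outside Squid, from the proxy host itself. The following is a minimal
Python sketch, not part of Squid: it sends one hand-built A-record query
over UDP to each server given on the command line and reports whether a
reply arrived or the query timed out. The names build_query and probe,
the test hostname example.com, and the 5 second timeout are illustrative
assumptions, and the sketch handles IPv4 nameservers only.

```python
import socket
import struct
import sys


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS A-record query packet (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, hostname="example.com", timeout=5.0):
    """Send one query to `server` and describe what came back."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return "error: %s" % exc
    if len(reply) < 4 or reply[:2] != query[:2]:
        return "unexpected reply"
    # The low 4 bits of the fourth header byte carry the response code.
    return "answered (rcode %d)" % (reply[3] & 0x0F)


if __name__ == "__main__":
    # Usage: python3 dns_probe.py <primary-ip> <secondary-ip>
    for server in sys.argv[1:]:
        print(server, probe(server))
```

A "timeout" for the primary together with "answered (rcode 0)" for the
secondary would confirm that the secondary is reachable from the proxy
host, so any remaining failure lies in how Squid uses it.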

Re: dns failover failing with 3.4.7

talikarni
On 7/27/2015 17:25 PM, Amos Jeffries wrote:

> On 28/07/2015 8:38 a.m., Mike wrote:
>> Running into an issue, using the squid.conf entry
>> dns_nameservers 72.x.x.x 72.x.y.y
>>
>> These are different servers (under our control) for the purpose of
>> filtering than listed in resolv.conf (which are out of our control, used
>> for server IP routing by upstream host).
>>
>> The problem we found like this weekend is if the primary listed dns
>> server is unavailable, squid fails to use the secondary listed server.
>> Instead it displays the "unable to connect" type messages with all
>> websites.
> Details please. How do you know the secondary is not even being tried?
>
> What is Squid getting back from the primary when its "down" ?
>   or just dns_timeout being hit?
>
> Add this to squid.conf to get a cache.log trace of the DNS activity:
>    debug_options 78,6
>
>
> Amos
>
Amos,

If Squid were using the secondary server listed, almost all websites
would not be failing to load while the primary was down. For the test we
temporarily took the primary DNS server offline (72.x.x.x in the example
above). No websites would load unless they were already in the Squid
cache, and even then any elements that required additional data failed
to load, causing formatting issues with the displayed page.

If we swap the order so that the "secondary" is listed first:
dns_nameservers 72.x.y.y 72.x.x.x
it behaves the same way: take ".y.y" down and Squid refuses to use the
second listed IP ".x.x" for DNS, and the browsers instead display the
"website could not be displayed" error.

We even tried another test with
dns_nameservers 72.x.x.x 8.8.8.8
and let it run for an hour or so. Then we took down the primary, which
means Squid should use the secondary Google IP of 8.8.8.8, but it does
not; it goes right back to the "website could not be displayed" error in
the browsers.

I was wondering if this might be a bug. This is happening on multiple
servers: one has Squid 3.4.7, another has 3.4.6, and the problem occurs
on both.

Thanks
Mike




Re: dns failover failing with 3.4.7

Amos Jeffries
Administrator
On 31/07/2015 3:48 a.m., Mike wrote:

> On 7/27/2015 17:25 PM, Amos Jeffries wrote:
>> On 28/07/2015 8:38 a.m., Mike wrote:
>>> Running into an issue, using the squid.conf entry
>>> dns_nameservers 72.x.x.x 72.x.y.y
>>>
>>> These are different servers (under our control) for the purpose of
>>> filtering than listed in resolv.conf (which are out of our control, used
>>> for server IP routing by upstream host).
>>>
>>> The problem we found like this weekend is if the primary listed dns
>>> server is unavailable, squid fails to use the secondary listed server.
>>> Instead it displays the "unable to connect" type messages with all
>>> websites.
>> Details please. How do you know the secondary is not even being tried?
>>
>> What is Squid getting back from the primary when its "down" ?
>>   or just dns_timeout being hit?
>>
>> Add this to squid.conf to get a cache.log trace of the DNS activity:
>>    debug_options 78,6
>>
> Amos,
>
> If it was using the secondary server listed, connections to almost all
> websites would not be failing to load if primary was down. For the test
> we temporarily took the primary DNS server offline (per the example
> above 72.x.x.x), and no websites would load unless it was in the squid
> cache, but any elements that required additional data failed to load
> causing formatting issues with the displayed website. If we swap the
> setting to the "secondary" with the first IP (per the example above) as
> dns_nameservers 72.x.y.y 72.x.x.x
> and it works the same way, take the ".y.y" down and it refuses to use
> the secondary listed IP ".x.x" for DNS, instead displays the website
> could not be displayed error in the browsers. We even tried another test
> (per the example above) dns_nameservers 72.x.x.x 8.8.8.8 then let it run
> for an hour or so. Then we took down the primary which means it should
> use the secondary google IP of 8.8.8.8, but it doesn't, goes right back
> to the "website could not be displayed" error in the browsers.
> I was wonder if this might be a bug. This is happening on multiple
> servers, one has squid 3.4.7, another has 3.4.6 and problem occurs on both.
>

Thank you, that is exactly the kind of answer I was looking for to
question #1. (Evidence that the problem is what you think it is before
digging for a cause.)

It kind of answers Q2 as "nothing", implying that the answer to Q3 is
"yes", dns_timeout is being hit.


Is your dns_timeout (default 30 sec *total* DNS lookup timeout) larger
than your dns_retransmit_interval (default 5 sec per-query timeout) setting?


Amos
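
Why the relationship between those two timers matters can be shown with
a small model. The Python sketch below is a simplification, not Squid's
actual code: it assumes the behaviour described in the squid.conf
documentation for dns_retransmit_interval (the interval is doubled each
time all configured nameservers have been tried) and further assumes
that unanswered queries rotate through the dns_nameservers list in
order. The function name retry_schedule and the labels "primary" and
"secondary" are hypothetical.

```python
def retry_schedule(dns_timeout, retransmit_interval, servers):
    """List (send_time_seconds, server) pairs for a lookup that is never answered.

    Simplified model: queries rotate through `servers` in order, the wait
    before the next send doubles after each full pass over the list, and
    no new query is sent once `dns_timeout` seconds have elapsed.
    """
    schedule = []
    now = 0.0
    sends = 0
    while now < dns_timeout:
        schedule.append((now, servers[sends % len(servers)]))
        sends += 1
        # Number of complete passes over the server list so far.
        passes_done = (sends - 1) // len(servers)
        now += retransmit_interval * (2 ** passes_done)
    return schedule


if __name__ == "__main__":
    # Defaults mentioned in the thread: 30 s total, 5 s initial interval.
    for when, server in retry_schedule(30, 5, ["primary", "secondary"]):
        print("t=%4.1fs -> %s" % (when, server))
```

In this model the defaults give the secondary its first query 5 seconds
after the primary. If dns_timeout were not larger than
dns_retransmit_interval, the lookup would be abandoned before the
secondary was ever queried.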

Re: dns failover failing with 3.4.7

talikarni
On 7/30/2015 16:30 PM, Amos Jeffries wrote:

> On 31/07/2015 3:48 a.m., Mike wrote:
>> On 7/27/2015 17:25 PM, Amos Jeffries wrote:
>>> On 28/07/2015 8:38 a.m., Mike wrote:
>>>> Running into an issue, using the squid.conf entry
>>>> dns_nameservers 72.x.x.x 72.x.y.y
>>>>
>>>> These are different servers (under our control) for the purpose of
>>>> filtering than listed in resolv.conf (which are out of our control, used
>>>> for server IP routing by upstream host).
>>>>
>>>> The problem we found like this weekend is if the primary listed dns
>>>> server is unavailable, squid fails to use the secondary listed server.
>>>> Instead it displays the "unable to connect" type messages with all
>>>> websites.
>>> Details please. How do you know the secondary is not even being tried?
>>>
>>> What is Squid getting back from the primary when its "down" ?
>>>    or just dns_timeout being hit?
>>>
>>> Add this to squid.conf to get a cache.log trace of the DNS activity:
>>>     debug_options 78,6
>>>
>> Amos,
>>
>> If it was using the secondary server listed, connections to almost all
>> websites would not be failing to load if primary was down. For the test
>> we temporarily took the primary DNS server offline (per the example
>> above 72.x.x.x), and no websites would load unless it was in the squid
>> cache, but any elements that required additional data failed to load
>> causing formatting issues with the displayed website. If we swap the
>> setting to the "secondary" with the first IP (per the example above) as
>> dns_nameservers 72.x.y.y 72.x.x.x
>> and it works the same way, take the ".y.y" down and it refuses to use
>> the secondary listed IP ".x.x" for DNS, instead displays the website
>> could not be displayed error in the browsers. We even tried another test
>> (per the example above) dns_nameservers 72.x.x.x 8.8.8.8 then let it run
>> for an hour or so. Then we took down the primary which means it should
>> use the secondary google IP of 8.8.8.8, but it doesn't, goes right back
>> to the "website could not be displayed" error in the browsers.
>> I was wonder if this might be a bug. This is happening on multiple
>> servers, one has squid 3.4.7, another has 3.4.6 and problem occurs on both.
>>
> Thank you exactly the kind of answer I was looking for question #1.
> (Evidence that the problem is what you think it is before digging for a
> cause).
>
> Kind of answers Q2 a "nothing", implying that Q3 is "yes" dns_timeout is
> happening.
>
>
>   Is your dns_timeout (default 30 sec *total* DNS lookup timeout) larger
> than your dns_retransmit_interval (default 5 sec per-query timeout) setting?
>
>
> Amos
>
We do not have a dns_timeout or dns_retransmit_interval entry in
squid.conf. We only use the dns_nameservers entry and leave the timeouts
at their defaults, since so few websites should take much longer than
that to load; if one does, it is a misspelled URL, a foreign server
(which very few of our customers use), or a site that is likely having
issues anyway.

I suspect this may be a bug in Squid 3.4.x, since the issue happened on
two different Squid servers, one running 3.4.6 and the other 3.4.7. The
backup for each runs 3.5.x (one has 3.5.1, the other 3.5.6, which I
updated today), and neither is affected: both of these 3.5.x servers
properly see that the primary is not reachable and use the secondary DNS
IP.

Thanks Amos,


Mike

Re: dns failover failing with 3.4.7

Amos Jeffries
Administrator
On 31/07/2015 10:39 a.m., Mike wrote:
> I suspect this may be a bug with squid 3.4.x since this issue happened
> on 2 different squid servers, one is 3.4.6, another is 3.4.7. Yet on the
> backups to each, one has 3.5.1 and other has 3.5.6 (I updated it today),
> and they are not affected by this, both of these squid v3.5.x servers
> properly see the primary is not reachable and uses the secondary DNS IP.
>

Yes, I agree with that conclusion. Good to know it's fixed, whatever it
was. :-)

Cheers
Amos
