[lug] Troubleshooting Transient DNS failures

Chris McDermott csmcdermott at gmail.com
Wed Jan 16 13:49:12 MST 2019


I've used the check_dns Nagios plugin with great success.

https://www.monitoring-plugins.org/doc/man/check_dns.html

I think you can grab it on its own, or it will come bundled with the
nagios-plugins CentOS package. The simplest thing would be to run it from
cron and append all the output to a file somewhere. There are plenty of
ways to get fancier than that.
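
If installing the plugin isn't an option, a rough stand-in for the
cron-and-log idea might look like the sketch below. It is not check_dns
itself: it asks the system resolver through Python's socket.getaddrinfo
and builds one timestamped line per lookup. The function names and the
log line layout are made up for illustration.

```python
#!/usr/bin/env python3
"""Log one DNS lookup result per run; meant to be called from cron."""
import socket
import sys
import time
from datetime import datetime, timezone


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname once and return a one-line status string."""
    started = time.monotonic()
    try:
        answers = resolver(hostname, None)
        # getaddrinfo returns (family, type, proto, canonname, sockaddr)
        # tuples; sockaddr[0] is the address as a string.
        addresses = sorted({answer[4][0] for answer in answers})
        status = "OK " + ",".join(addresses)
    except socket.gaierror as err:
        # Resolution failures (NXDOMAIN, resolver unreachable, ...) land here.
        status = "FAIL " + str(err)
    elapsed_ms = (time.monotonic() - started) * 1000
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} {hostname} {elapsed_ms:.0f}ms {status}"


def append_line(path, line):
    """Append a single line to the log file, creating it if needed."""
    with open(path, "a") as log:
        log.write(line + "\n")


# Only does work when called as: dns_check.py HOSTNAME LOGFILE
if __name__ == "__main__" and len(sys.argv) == 3:
    append_line(sys.argv[2], check_dns(sys.argv[1]))
```

Run from cron once a minute with a hostname and a log path as the two
arguments, the resulting file should make it possible to line up FAIL
entries with the times the Jenkins jobs broke. The script name
dns_check.py is only a placeholder.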

Chris

On Wed, Jan 16, 2019 at 2:06 PM Bear Giles <bgiles at coyotesong.com> wrote:

> Could you insert your own caching DNS server?
>
> You could sell it by pointing out that DNS queries are a common way to
> exfiltrate data via the payload. (In fact many of us are suspicious that
> that's why there was such heavy DNS traffic between the Trump campaign and
> Alfa Bank.) Putting in a caching DNS server can go a long way towards
> blocking that if your firewall blocks all traffic on port 53 except traffic
> to/from your caching DNS server.
>
> Of course it's also a way to send all attempts to hit the facebook servers
> into the great bit bucket in the sky. :-)
>
> On Wed, Jan 16, 2019 at 9:41 AM George Sexton <georges at mhsoftware.com>
> wrote:
>
>> On 2019-01-16 09:37, Rob Nagler wrote:
>> > Not exactly the answer you are looking for, but...
>> >
>> > Over the years I've found my colo/ISPs had issues with DNS so I
>> > switched to Google and now CloudFlare. I find them to be extremely
>> > reliable, and moved to CloudFlare, since they aren't Google.
>> >
>> > Your failure is not likely to be local so spending time debugging it
>> > may be a total waste of time. Pop and swap. :)
>>
>> Unfortunately, this is in a corporate setting and that would go over
>> like the proverbial lead zeppelin.
>>
>>
>> > One time we were having issues with TCP DNS. One of their caching
>> > servers was misconfigured. Another time, certain sites (e.g.
>> > github.com [1]) were just failing randomly, which would mess up builds
>>
>> Yes. I've got Jenkins jobs that are randomly failing and it makes the
>> devs unhappy.
>>
>> I understand I have no control, but a) they ask me why their job failed
>> and b) I'm concerned about the perception that the CI/CD pipeline is
>> unreliable, and it's my job to make it reliable...
>>
>> > randomly. As you say, you don't have control of the DNS servers, so
>> > you're going to be debugging your ISP/colo (read: maybe the same one
>> > :).
>> >
>> > Rob
>> >
>> >
>> >
>> > Links:
>> > ------
>> > [1] http://github.com
>> > _______________________________________________
>> > Web Page:  http://lug.boulder.co.us
>> > Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
>> > Join us on IRC: irc.hackingsociety.org port=6667
>> > channel=#hackingsociety
>>
>> --
>> George Sexton
>> MH Software, Inc. - Home of connectDaily Web Calendar
>> https://www.mhsoftware.com/