
How to Build Your Own DNS Sinkhole and DNS Logs Monitoring System

Updated October 26, 2023; blog post originally published on February 5, 2018 by Ben Hughes.

Pi-Hole Worldwide

Recently, I’ve been playing around with Pi-hole, an increasingly popular network adblocker designed to run on a Raspberry Pi. Pi-hole functions as your network’s DNS server, allowing it to block ad domains, malicious domains, and any other domains (or TLD wildcards) that you add to its block lists -- effectively turning it into an open source, lightweight DNS sinkhole. This blocking occurs at the network level, meaning blocked resources never even reach your endpoint’s browser. Along with caching, this can improve website load performance and block ads that are difficult to block client-side, such as in-app ads on Android and iOS devices.

Pi-hole also logs each DNS event, including domain resolutions and blocks. DNS logs are a gold mine that is sadly often overlooked by network defenders. Examples of malicious network traffic that can be identified in DNS logs include command and control (C2) traffic from a variety of malware including ransomware; malicious ads and redirects; exploit kits; phishing; typosquatting attacks; DNS hijacking; denial of service (DoS) attacks; and DNS tunneling.

While BIND and Windows DNS servers are perhaps more popular DNS resolver implementations, Pi-hole uses the very capable and lightweight dnsmasq as its DNS server. And while Pi-hole includes a nice web-based admin interface, I started to experiment with shipping its dnsmasq logs to the Elastic (AKA ELK) stack for security monitoring and threat hunting purposes. In the end, I quickly prototyped a Pi-hole based DNS sinkhole deployment, DNS log pipeline, and accompanying DNS log monitoring system thanks to Pi-hole’s dnsmasq implementation, the ELK (Elasticsearch, Logstash, and Kibana) stack, and Beats. This project is still a work in progress in my lab, but I thought I would share what I’ve learned so far. The steps are not difficult, but this guide assumes you have at least a basic familiarity with Linux commands, DNS logs, and the ELK stack.


Pi-hole is a DNS server / network adblocker / DNS sinkhole that is designed to run on minimal hardware, including the Raspberry Pi. If you plan to use a Raspberry Pi, keep in mind that the DNS logs must be shipped to Logstash using rsyslog (another blog post will provide the steps for this). We will be covering the setup process for both a Pi-hole VM and Raspberry Pi OS. I installed Pi-hole in an Ubuntu 22.04 Server VM. Typically, Pi-hole runs fine with only 1 CPU core and 512 MB RAM, though I allocated more to account for log shipping overhead. Pi-hole is suitable for SOHO and SMB networks, with reports of success on networks containing hundreds of endpoints.

Pi-hole installation and configuration are well documented elsewhere, so I won’t dwell on the details here. You can actually install Pi-hole with a one-line command (curl -sSL | bash), though of course it is always good security practice to review the script before executing it. Running the install script walks you through the initial setup, where you can assign a static IP address to the Pi-hole server, choose your upstream DNS resolution service (I recommend a security- and privacy-oriented option such as OpenDNS or Quad9), and enable the web admin interface.

Pi-hole: A Black Hole for Internet Advertisements

The admin password is displayed at the end of the install script, though you can always change it later. Once installed, you can review the excellent Pi-hole dashboard and take care of most administrative tasks by logging into its web interface at:

Pi-Hole Admin Dashboard

Once Pi-hole is up and running, you need to point your endpoints to your Pi-hole server’s IP address (which should be static) so that they will use the Pi-hole for DNS resolution going forward. You can set this manually per device, or configure most routers to use the Pi-hole as the DNS server. Beyond functioning as your network’s DNS server, Pi-hole (again thanks to dnsmasq) can also be a DHCP server. Each of these deployment options has pros and cons, such as endpoint IP visibility, so read Pi-hole’s relevant documentation for more details. By default, Pi-hole leverages several ad blocklists, though you are free to add your own lists, domains, or wildcards via the web interface or command line.

Pi-Hole Domain Management Screen

DNS Logs Pipeline

By default, Pi-hole stores its dnsmasq logs at /var/log/pihole/pihole.log. Beyond glancing at the Dashboard metrics and top lists, there are several ways to manually review these logs, including the “Query Log” area of the web interface, “Tail pihole.log” under Tools in the web interface, and directly via SSH access to the underlying server running Pi-hole (e.g., tail -f /var/log/pihole/pihole.log). Based on the Dashboard metrics, I know that on most days, my Pi-hole lab deployment blocks an average of about 10% of the total domain resolutions, and Windows 10 telemetry subdomains are often the most blocked DNS requests (which is great, because disabling such Win10 telemetry client-side ranges from extremely difficult to impossible). While such information is useful, we can ship these valuable DNS logs to a centralized location for log enrichment and monitoring purposes, including security analytics and threat hunting.

Below is an example of raw dnsmasq logs from pihole.log:

Oct 25 12:28:13 dnsmasq[616]: 195649 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195649 forwarded to
Oct 25 12:28:13 dnsmasq[616]: 195649 reply is <CNAME>
Oct 25 12:28:13 dnsmasq[616]: 195649 reply is
Oct 25 12:28:13 dnsmasq[616]: 195650 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195650 cached is
Oct 25 12:28:13 dnsmasq[616]: 195651 query[AAAA] from
Oct 25 12:28:13 dnsmasq[616]: 195651 forwarded to
Oct 25 12:28:13 dnsmasq[616]: 195651 reply is 2a04:4e42:77::773
Oct 25 12:28:13 dnsmasq[616]: 195652 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195652 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195653 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195653 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195654 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195654 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195655 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195655 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195656 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195656 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195657 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195657 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195658 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195658 forwarded to
Oct 25 12:28:13 dnsmasq[616]: 195659 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195659 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195658 reply is
Oct 25 12:28:13 dnsmasq[616]: 195661 query[A] from
Oct 25 12:28:13 dnsmasq[616]: 195661 gravity blocked is
Oct 25 12:28:13 dnsmasq[616]: 195662 query[AAAA] from
Oct 25 12:28:14 dnsmasq[616]: 195684 forwarded to
Oct 25 12:28:14 dnsmasq[616]: 195684 reply is
Oct 25 12:28:14 dnsmasq[616]: 195685 query[A] from
Oct 25 12:28:14 dnsmasq[616]: 195685 cached is
Oct 25 12:28:14 dnsmasq[616]: 195685 cached is
Oct 25 12:28:14 dnsmasq[616]: 195685 cached is
Oct 25 12:28:14 dnsmasq[616]: 195685 cached is
Oct 25 12:28:14 dnsmasq[616]: 195686 query[AAAA] from
Oct 25 12:28:14 dnsmasq[616]: 195686 forwarded to
Oct 25 12:28:14 dnsmasq[616]: 195686 reply is NODATA-IPv6
Oct 25 12:28:15 dnsmasq[616]: 195687 query[PTR] from
Oct 25 12:28:15 dnsmasq[616]: 195687 forwarded to
Oct 25 12:28:15 dnsmasq[616]: 195688 query[PTR] from
Oct 25 12:28:15 dnsmasq[616]: 195688 forwarded to
Oct 25 12:28:15 dnsmasq[616]: 195689 query[PTR] from
Oct 25 12:28:15 dnsmasq[616]: 195689 forwarded to
Oct 25 12:28:15 dnsmasq[616]: 195687 reply is NXDOMAIN
Oct 25 12:28:15 dnsmasq[616]: 195688 reply is NXDOMAIN

You can see from the sample logs that one of my lab machines is using IP, that I am using the relatively new Quad9 (9.9.9.9, get it?) as my upstream DNS provider in Pi-hole, and that there are multiple DNS requests that are probably related to Microsoft software.

These aren’t the prettiest types of logs I’ve ever seen -- essentially there are multiple lines for each type of DNS event -- but they get the job done and have a standardized syslog-style timestamp per line, which we’ll need for our log shipment pipeline to the ELK stack.
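Since each DNS event spans several lines tied together by dnsmasq’s per-query serial number, a little scripting can reassemble them. Below is a rough Python sketch of that idea; the domains and IPs are made-up placeholders, since real values vary per network:

```python
import re
from collections import defaultdict

# Regex approximating the pihole.log lines shown above. The serial number
# after the PID ties the multiple lines of one DNS event together.
LINE = re.compile(
    r"^(?P<logdate>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"dnsmasq\[(?P<pid>\d+)\]: (?P<serial>\d+) "
    r"(?P<action>query\[\w+\]|forwarded|reply|cached|gravity blocked) "
    r"(?P<domain>\S+) (?:is|from|to) (?P<value>\S+)$"
)

# Hypothetical sample lines; real pihole.log entries follow this shape.
sample = [
    "Oct 25 12:28:13 dnsmasq[616]: 195649 query[A] example.com from 192.168.1.50",
    "Oct 25 12:28:13 dnsmasq[616]: 195649 forwarded example.com to 9.9.9.9",
    "Oct 25 12:28:13 dnsmasq[616]: 195649 reply example.com is 93.184.216.34",
    "Oct 25 12:28:13 dnsmasq[616]: 195652 query[A] ads.example.net from 192.168.1.50",
    "Oct 25 12:28:13 dnsmasq[616]: 195652 gravity blocked ads.example.net is 0.0.0.0",
]

# Group the actions of each DNS event by its serial number.
events = defaultdict(list)
for line in sample:
    m = LINE.match(line)
    if m:
        events[m["serial"]].append(m["action"])

print(dict(events))
# {'195649': ['query[A]', 'forwarded', 'reply'], '195652': ['query[A]', 'gravity blocked']}
```

Grouping on the serial number like this is also the key to correlating lines that dnsmasq logs separately, such as a blocked domain and the client that requested it.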

At a high level, this represents the log shipment pipeline I set out to prototype:

Endpoints (client DNS requests) > Pi-hole (DNS server/sinkhole) > Filebeat (log shipper) > Logstash (log shaper) > Elasticsearch (log storage and indexing backend) and Kibana (log analysis frontend)

Essentially, the endpoints use Pi-hole as their DNS server. Pi-hole logs dnsmasq events including domain resolutions and blocklist matches to a local log file. I opted to use Filebeat, one of Elastic’s lightweight log shippers, directly on the Pi-hole server to ship those dnsmasq logs in real-time to a Logstash server. I created some custom configs for Logstash in order to implement basic field mappings, implement an accurate timestamp, and enrich the logs by adding GeoIP location lookups for external IP addresses from resolved domains. Logstash then ships those processed logs to a separate Elasticsearch server for storage and indexing, with Kibana serving as the frontend on the same server for manual searches, visualizations, and dashboards.

As an aside, one reason I found this project interesting is that there seems to be plentiful Internet chatter on working with BIND and Microsoft DNS logs, but not nearly as much about dnsmasq logs. That said, although the DNS log pipeline described here is designed for Pi-hole’s dnsmasq logs, it can be easily adapted for other types of DNS logs such as BIND and Microsoft.

Back to business. Let’s walk through each of the major parts of the DNS logs pipeline in more detail. This guide will not cover the installation and basic configuration of the ELK stack itself, as this is well documented elsewhere. For my testing, I installed Logstash on an Ubuntu 22.04 Server VM, and Elasticsearch and Kibana on a separate Ubuntu Server VM. My main advice for deploying ELK is to ensure you allocate plenty of RAM. Ensure that your Logstash, Elasticsearch, and Kibana servers are all operational and you know their static IPs before proceeding. For this project, I am using the 8.x versions of the ELK stack components.


First, we need to install Filebeat on the Pi-hole server. Note that while Pi-hole itself has minimal system requirements (it typically runs fine with 1 core and 512 MB RAM), running Filebeat on the same server will generate some performance overhead. In my case, I erred on the side of caution and allocated 2 cores and 2 GB RAM to the Pi-hole server to account for the Filebeat addition, but even that is likely overkill for a small deployment. CPU usage is minuscule, and total RAM utilization is typically <10% on my Pi-hole server.

Since I am using Ubuntu Server, I can manually wget and install a 64-bit DEB package, or follow Elastic’s instructions for installing from the official repo. The process would be the same for other Debian-based distros.

Once Filebeat is installed, I need to customize its filebeat.yml config file to ship Pi-hole’s logs to my Logstash server. You can either modify the default Filebeat input, which covers the default log location /var/log/*.log, to also include the pihole folder (/var/log/pihole/*.log), or specify /var/log/pihole/pihole.log to ship only Pi-hole’s dnsmasq logs. Keep in mind that Filebeat’s default paths config (“/var/log/*.log”) will not pick up Pi-hole’s logs, since they are located in a subfolder called ‘pihole’.

Filebeat default paths

We also need to point Filebeat to the Logstash server’s IP. I’m sticking with Logstash’s default port 5044.
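Putting both settings together, a minimal filebeat.yml sketch looks something like the following. The 192.0.2.10 address is a placeholder for your own Logstash server’s IP, and filestream is the input type used by Filebeat 8.x (older Filebeat versions used "log" inputs/prospectors instead):

```yaml
filebeat.inputs:
  # Ship only Pi-hole's dnsmasq log rather than all of /var/log/*.log
  - type: filestream
    id: pihole-dnsmasq
    paths:
      - /var/log/pihole/pihole.log

output.logstash:
  # Placeholder address -- use your Logstash server's static IP and port 5044
  hosts: ["192.0.2.10:5044"]
```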

Logstash Port 5044

Since I’m using Ubuntu 22.04 Server as the underlying OS for everything, the proper command to then start Filebeat manually is: sudo systemctl start filebeat. Filebeat will immediately start shipping the specified logs to Logstash. You can also configure Filebeat (as well as the ELK stack components) to start automatically on boot, e.g., sudo systemctl enable filebeat.

Filebeat Logstash config file

While Filebeat requires minimal configuration to get started, Logstash configuration is much more involved. For my DNS logs pipeline, I installed Logstash on a dedicated Ubuntu Server VM. I named my custom config file dnsmasq.conf and ended up writing my own grok pattern filters to match interesting dnsmasq logs in order to properly process and enrich them.

First, we specify the Logstash input in our custom config file, which is simply listening on its default port 5044 for logs shipped from Filebeat:

input {
  beats {
    port => 5044
    type => "logs"
    tags => ["pihole","5044"]
  }
}

Then we need to create a custom grok filter to match on the specific dnsmasq logs we are interested in. This has been the most time consuming part of this project, as there are multiple formats that dnsmasq logs take, and essentially a single DNS event gets broken into multiple lines. This is where I first learned about Grok Constructor, an extremely useful web-based tool for building and testing grok regular expression (regex) patterns. Through trial and error, I got a few basic matches working for DNS query and reply logs. There is clearly still work to be done; for example, a blacklisted domain and the originating client IP are logged on separate lines by dnsmasq (they are effectively separate logs), so addressing that remains on my to-do list.

filter {

  if "pihole" in [tags] {
    grok {
      patterns_dir => ["/etc/logstash/patterns/"]
      match => {
        "message" => [
          "%{logdate:LOGDATE} dnsmasq\[(?<dnsmasq>\d+)\]: (?<serial>\d+) (?<type>reply|cached|query|forwarded|query\[A\]|query\[AAAA\]|query\[HTTPS\]|query\[PTR\]|gravity blocked) %{domain:domain_request} (?<direction>is|from|to) %{IP:ip_response}",

          "%{logdate:LOGDATE} dnsmasq\[(?<dnsmasq>\d+)\]: (?<serial>\d+) (?<type>reply|cached|query|forwarded|query\[A\]|query\[AAAA\]|query\[HTTPS\]|query\[PTR\]|gravity blocked) %{domain:domain_request} (?<direction>is|from|to) %{IPV6:ip_response}",

          "%{logdate:LOGDATE} dnsmasq\[(?<dnsmasq>\d+)\]: (?<serial>\d+) (?<type>reply|cached|query|forwarded|query\[A\]|query\[AAAA\]|query\[HTTPS\]|query\[PTR\]) %{domain:domain_request} (?<direction>is|from|to) (?<ip_response>NODATA-IPv6|\<CNAME\>|NODATA|NXDOMAIN)"
        ]
      }
    }

    # to do: cached and cached reverse

    if [message] =~ "cached" and [message] =~ "NXDOMAIN" {
      mutate {
        add_tag => [ "cached NXDOMAIN" ]
      }
    } else if [message] =~ "NODATA" {
      mutate {
        add_tag => [ "NODATA" ]
      }
    } else if "reply" in [type] {
      mutate {
        add_tag => [ "reply" ]
      }
    }

    geoip {
      source => "ip_response"
      target => "ip_response_geo"
    }

    date {
      match => [ "LOGDATE", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
    }
  }
}
The above example grok patterns match the majority of distinct types of dnsmasq logs, including initial DNS queries, replies, and blocked requests.

You can see in my filter that I also specify a “patterns_dir”. In order to use custom patterns (which I have named the same as their respective fields in ALL CAPS) in a grok match, you must list them in a patterns file located in the specified directory.

The contents of my custom patterns file, which I simply saved to /etc/logstash/patterns/dnsmasq:

logdate [\w]{3}\s[\s\d]{2}\s\d\d\:\d\d\:\d\d
blocklist [\/\w\.]+
domain [\w\.\-]+
clientip \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
ip \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
ipv6 ([0-9]|[a-f]|[A-F]){0,4}:{1,2}
FQDN \b(?:[\w-][\w-]{0,62})(?:\.(?:[\w-][\w-]{0,62}))*(\.?|\b)
DNSMASQPREFIX %{SYSLOGTIMESTAMP:date} %{SYSLOGPROG}: %{INT:logrow} %{IP:source_host}\/%{POSINT:source_port}
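Since grok patterns are ordinary regexes under the hood, you can sanity-check them outside of Logstash before a full pipeline run. Here is a quick Python check of the logdate and domain patterns above (the test values are invented for illustration):

```python
import re

# Pattern definitions copied from the custom patterns file above.
logdate = r"[\w]{3}\s[\s\d]{2}\s\d\d\:\d\d\:\d\d"
domain = r"[\w\.\-]+"

# logdate must accept both two-digit days and single-digit days
# padded with an extra space, as syslog-style timestamps use both.
assert re.fullmatch(logdate, "Oct 25 12:28:13")
assert re.fullmatch(logdate, "Oct  5 12:28:13")

# A typical (made-up) domain from a query line.
assert re.fullmatch(domain, "telemetry.example.com")
print("patterns OK")
```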

Note that I have not finished writing and perfecting grok patterns for all possible dnsmasq log types and fields. There are a few types of dnsmasq logs that I still need to address, and I’m sure refinements are needed for the somewhat crude-but-effective patterns I did write, to account for things like odd characters in domain names. See the screenshots below of Grok Constructor in action:

Grok Constructor 1
Grok Constructor 2
Grok Constructor 3

One issue I quickly ran into during testing was that the @timestamp field did not match my LOGDATE field once the logs arrived in Elasticsearch for indexing. LOGDATE represents the original timestamp of the dnsmasq event, while the @timestamp added in Elasticsearch represents the time the log was successfully shipped into Elasticsearch, which typically lags slightly behind the LOGDATE. Fortunately, Logstash’s date filter plugin makes it easy to fix this as follows:

    date {
      match => [ "LOGDATE", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
    }

Essentially, the dnsmasq logs have two possible representations for their syslog-style timestamp field, which I have named LOGDATE: two-digit days, and single-digit days preceded by an extra space. The date filter above normalizes this so that the @timestamp field exactly matches the original corresponding LOGDATE field, and also appends the current year.
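As a small aside on why the year must be appended, this Python sketch (not part of the pipeline itself) shows what happens when a year-less syslog timestamp is parsed naively:

```python
from datetime import datetime

logdate = "Oct 25 12:28:13"

# Syslog-style timestamps carry no year, so strptime falls back to 1900;
# Logstash's date filter fills in the current year instead.
raw = datetime.strptime(logdate, "%b %d %H:%M:%S")
print(raw.year)  # 1900

normalized = raw.replace(year=datetime.now().year)
print(normalized.isoformat())
```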

For resolved domains, Pi-hole also logs the resulting IP addresses. Accordingly, we want to enrich our logs with GeoIP location data. Logstash’s geoip filter plugin makes this remarkably easy:

    geoip {
      source => "ip_response"
      target => "ip_response_geo"
    }

What does this accomplish? Whenever this filter identifies an IP address for a resolvable domain, it enriches the document with GeoIP location data by adding various fields (drawing from the bundled MaxMind GeoLite2 database) like so:

GeoIP and other Geo Fields
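For reference, the enriched document ends up with a nested object roughly like the following. The field names reflect a typical Logstash 8.x geoip result with ECS compatibility enabled; the values here are invented for illustration:

```json
"ip_response_geo": {
  "ip": "93.184.216.34",
  "geo": {
    "country_name": "United States",
    "country_iso_code": "US",
    "city_name": "Norwell",
    "location": { "lat": 42.15, "lon": -70.82 }
  }
}
```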

With this GeoIP data, we will be able to run searches and build Kibana visualizations, such as maps based on where IPs are geolocated. A detailed guide to creating Kibana visualizations with this GeoIP / geopoint data will follow in a future blog post.

Finally, we need to configure Logstash to send these freshly shaped and enriched logs to the Elasticsearch server. In my sample Logstash config, it looks like this (be sure to specify your own IP and index naming convention preference):

output {
  elasticsearch {
    hosts => ["xxx"]
    user => "elastic"
    password => "${ES_PWD}"
    ssl_enabled => true
    ssl_certificate_authorities => "xxx.crt"
    index => "pihole-%{+YYYY.MM.dd}"
  }
#  stdout { codec => rubydebug { metadata => true } }
}

Once the config file is ready, run Logstash and specify that it load our config file:

sudo bin/logstash -f dnsmasq.conf

Don’t be discouraged if Logstash throws an error related to your config file; read the error message carefully and fix your config accordingly. An errant or missing brace or other typo is usually to blame in my experience. You can also validate a config without starting the pipeline by adding the --config.test_and_exit flag.

Once Logstash is running, you should see something like the following, indicating that it is successfully listening on its default port for logs:

Logstash Successfully Listening

And if you enable the stdout output (shown commented out in the sample config), processed logs will be printed to the screen in real time. This is often helpful for debugging problems with your grok filter or other parts of your overall log pipeline.

Before getting to this point, you should have Elasticsearch and Kibana installed and running on a separate server with plenty of RAM allocated. To ensure that our log pipeline is working properly from end to end, query Elasticsearch from the command line or web browser to list the relevant indices (e.g., curl -u elastic 'https://<your-elasticsearch-ip>:9200/_cat/indices/pihole-*?v'):

Logstash Indices

Once that is done, you can finish setting up your index in Kibana and start reviewing logs. In Kibana, go to Management > Index Patterns and finish creating a new index pattern corresponding to the index naming convention you configured in Logstash.

Index Patterns in Kibana

Be sure to use the @timestamp field as the “Time Filter field name”, click “Create index pattern” and you are all set to start working with the logs in Kibana.

Logs in Kibana

In my next post, I’ll share some sample Kibana searches, visualizations, and dashboards that make good use of our new and improved Pi-hole DNS logs for security monitoring and analytics. This includes a component template and ingest pipeline to add a GeoIP geopoint field for visualizations. In addition, that post will walk through the steps for shipping logs from a Raspberry Pi based Pi-hole instance into Logstash.

I’ll also share additional lessons learned and recommended next steps for this project. In the meantime, you can find my sample configs on GitHub, with the caveat that they should still be considered mostly in beta stage at this point.

Polito - Cybersecurity Consulting

Polito Inc. offers a wide range of security consulting services including penetration testing, vulnerability assessments, red team assessments, incident response, digital forensics, threat hunting, and more. If your business or your clients have any cybersecurity needs, contact our experts and experience what Masterful Cyber Security is all about.

Phone: 571-969-7039



