Off-Cluster Kubernetes Logging With Sumo Logic and Logentries

9 Sep 2016

One of the best parts of my job as a solutions architect for Deis is working with an amazing array of talented engineers at companies solving truly interesting problems.

I was recently working with a company at the forefront of wearable fitness trackers. Their modest but world-class engineering team had reached the outer limits of what could be done with Ansible-based Docker deployments in AWS.

While everything worked, there were areas of API entanglement and a lack of orchestration, which created duplicated effort and an inefficient use of EC2 resources. The company is clearly on a rocketship growth trajectory, so scaling and efficient systems management are at the forefront of everyone's mind.

Fortunately, they also recognized that the time to pivot to more efficient and scalable architecture is while they're still in an early growth phase.

Kubernetes is a good fit for their use case because it allows more atomic service distribution, easier scaling, and painless service discovery. Also, when the infrastructure below the cluster is configured with autoscaling, rapid growth should be no problem.

In this blog post, I'll take a look at one aspect of the work I did with them: how I got logs shipped off-cluster to Sumo Logic. I will also draw a link to some work I did for another company to send logs to Logentries.

Sending Kubernetes Logs to Sumo Logic

Moving their applications to the Kubernetes ecosystem was reasonably straightforward since the team had already moved everything to Docker, and for the most part, applications adhered to 12-Factor best practices.

Their logging approach prior to implementing Kubernetes was syslog paired with Sumo Logic, which was not a good fit for the new infrastructure. What I chose to do instead was implement the Sumo Logic collector (the tool responsible for collecting logs) as a daemon set, which runs one copy of a pod on every node in the cluster.

Surprisingly, there didn't seem to be any existing solutions for this, so I went ahead and created a repository called SumoKube with the service account and daemon set YAML files, along with a custom-built container image with the corresponding sumo-sources.json file in it. This way, the collector on each node is pulling all the container logs.
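To make the shape of this concrete, here is a minimal sketch in Python that builds a daemon set manifest and a sources file of the kind described above, then prints both as JSON (which kubectl accepts as well as YAML). It is not the contents of the SumoKube repository. The name sumo-collector, the image example/sumo-collector:latest, and the source name are hypothetical placeholders, apps/v1 is the API version current clusters use rather than the one available when this post was written, and the sources layout is my assumption about Sumo Logic's local file source format.

```python
import json

# Directory on each node where container log files are exposed.
LOG_DIR = "/var/log/containers"


def build_sources(path_expression=LOG_DIR + "/*.log"):
    """Return a sumo-sources.json style dict with one local file source.

    The field names are an assumption about the collector's JSON format.
    """
    return {
        "api.version": "v1",
        "sources": [
            {
                "sourceType": "LocalFile",
                "name": "k8s-container-logs",  # hypothetical source name
                "pathExpression": path_expression,
            }
        ],
    }


def build_daemon_set(name="sumo-collector", image="example/sumo-collector:latest"):
    """Return a DaemonSet manifest that mounts the node's log directory.

    The name and image are hypothetical placeholders.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "serviceAccountName": name,
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "volumeMounts": [
                                {
                                    "name": "container-logs",
                                    "mountPath": LOG_DIR,
                                    "readOnly": True,
                                }
                            ],
                        }
                    ],
                    "volumes": [
                        {"name": "container-logs", "hostPath": {"path": LOG_DIR}}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_daemon_set(), indent=2))
    print(json.dumps(build_sources(), indent=2))
```

On Docker-based nodes the entries in /var/log/containers are usually symlinks into Docker's own data directory, so a real manifest would likely need to mount that target as well.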

There are several different ways to tailor the configuration, as documented in the repository. But for the most part, all you need are account credentials. The defaults in SumoKube (and in leKube, covered below) will get you started.
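One common way to hand account credentials to a pod is a Kubernetes secret referenced from the container's environment. The sketch below shows that pattern in Python. How SumoKube actually takes its credentials is documented in the repository, so treat everything here as an assumption: the secret name sumo-credentials, the keys access-id and access-key, and the environment variable names SUMO_ACCESS_ID and SUMO_ACCESS_KEY are all illustrative.

```python
import base64
import json


def build_credentials_secret(access_id, access_key, name="sumo-credentials"):
    """Return a Secret manifest; Kubernetes expects base64-encoded values under 'data'."""

    def b64(value):
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},  # hypothetical secret name
        "type": "Opaque",
        "data": {"access-id": b64(access_id), "access-key": b64(access_key)},
    }


def credential_env(secret_name="sumo-credentials"):
    """Return env entries for a container spec that read from the secret.

    The variable names are assumptions, not taken from the repository.
    """
    return [
        {
            "name": "SUMO_ACCESS_ID",
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "access-id"}},
        },
        {
            "name": "SUMO_ACCESS_KEY",
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "access-key"}},
        },
    ]


if __name__ == "__main__":
    # Placeholder values; real credentials should never be committed to a repository.
    print(json.dumps(build_credentials_secret("my-id", "my-key"), indent=2))
    print(json.dumps(credential_env(), indent=2))
```

Keeping the credentials in a secret rather than in the daemon set manifest means the manifest itself can be shared or committed without exposing the account.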

After installing the daemon set, all of the current nodes in your cluster will show up in Sumo Logic. This is particularly valuable for clusters with worker nodes configured in an AWS autoscaling group, since new nodes will automatically show up once bootstrapped.

If you've gotten this far, you're essentially running production-grade off-cluster logging that makes aggregating, reporting, and analysis an order of magnitude easier than DIY solutions.

Off-cluster logging is also extremely helpful for infrastructures that need regulatory-compliant log handling. That said, clusters that store sensitive data in their logs, such as PHI or PII, should investigate more involved record storage options.

Sending Kubernetes Logs to Logentries

Coincidentally, I have been in contact with the folks over at Logentries as part of another client engagement, and I ran into the same issue: there was very little information on running Logentries in Kubernetes.

So, I went ahead and created leKube, which is similar to SumoKube in that it runs the Logentries collector as a daemon set.

In this case, I opted to use the Docker socket to collect logs since that is already supported in their base container image. This choice is less preferable than collecting from /var/log/containers, as was done for Sumo Logic, because any container connecting to the Docker socket represents a possible vector for executing privileged commands on the node host. Docker's documentation includes a good tutorial on securing the socket.
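The difference between the two collection strategies comes down to which host path the collector pod mounts. Here is a minimal Python sketch of the two volume definitions. The volume name log-source and the strategy labels are hypothetical, and /var/run/docker.sock is the default Docker socket location rather than something taken from leKube.

```python
def log_volume(strategy):
    """Return (volume, volumeMount) dicts for a collector pod spec.

    strategy is "log-files" (read files under /var/log/containers)
    or "docker-socket" (talk to the Docker daemon on the node).
    """
    if strategy == "log-files":
        # The collector only needs to read files, so a read-only mount is enough.
        path, read_only = "/var/log/containers", True
    elif strategy == "docker-socket":
        # The socket exposes the Docker API; marking the mount read-only
        # would not limit the API calls a container can make over it.
        path, read_only = "/var/run/docker.sock", False
    else:
        raise ValueError("unknown strategy: " + strategy)

    volume = {"name": "log-source", "hostPath": {"path": path}}
    mount = {"name": "log-source", "mountPath": path, "readOnly": read_only}
    return volume, mount


if __name__ == "__main__":
    for strategy in ("log-files", "docker-socket"):
        print(strategy, log_volume(strategy))
```

With the file-based strategy, the worst a compromised collector can do through the mount is read logs. With the socket, it can ask the Docker daemon to do anything the daemon is able to do, which is why the file-based approach is preferable where the collector supports it.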

The image below is a screenshot from the demo Logentries account I sent my cluster logs to. It was remarkably fast to start ingesting the logs, and the interface is also relatively easy to understand.

Just as we saw with Sumo Logic, the path to searchable, usable Kubernetes logs in Logentries is astonishingly straightforward.

Daemon sets are invaluable for use cases like this, and prove that Kubernetes has solved many of the problems that typically plague Site Reliability Engineers when managing distributed applications and clusters with ephemeral hosts.

Going from zero to logs being available took less than five minutes.

Wrap Up

Seeing Kubernetes flourish in real-world cases is always a wonderful experience. Hopefully you can experience for yourself how easy it is to get off-cluster logging up and running with SumoKube or leKube. And, if you've been hesitating to try Kubernetes because of logging integration concerns, wait no more.

What's more, both Sumo Logic and Logentries offer trial accounts, so you can quickly get set up and see whether this sort of approach is right for you.

This underscores one of the advantages of working with an infrastructure that is as flexible as it is powerful. You can rapidly prototype solutions without having to completely redesign your underlying infrastructure.

Posted in Kubernetes, logging, Sumo Logic, Logentries
