Anzr’s Blueprint No. 1 - Data Visualization Platform

We at Anzr, a company focused on cyber security and infrastructure as code, have developed blueprints for critical components supporting DevSecOps on public and private infrastructure. The blueprints are published at a high level for Anzr’s followers and can be designed and implemented in our customers’ environments. If you identify a need in this area, feel free to contact us! The first blueprint released is Anzr’s Data Visualization Platform!

Why have we designed a new Data Visualization Platform?

Data visualization of logs, metrics and events is key for any traditional or DevSecOps-based service delivery. It is imperative for the development and secure operation of applications and infrastructure, whether you are building microservices or traditional applications, and whether you run workloads on Azure, AWS, GCP, OpenShift or your own Kubernetes or virtualized server infrastructure.

The challenges for many organizations are to:

  1. gather all data and secure it both in transit and at rest,
  2. get a scalable platform, and configured as infrastructure as code,
  3. create structured data out of large volumes of unstructured data,
  4. build intelligent and comprehensive visualization of the data, and
  5. work on continuous improvements and life cycle management.

Anzr’s Data Visualization Platform is designed to address exactly these areas.

Data visualization is well known to be an imperative tool for secure service delivery, yet it is surprisingly often underdeveloped. In the following interview, Anzr’s co-founders Robert Teir and Magnus Blom discuss some fundamental aspects of Anzr’s Data Visualization Platform and its key design choices.

Ok Robert, what is data visualization in short?

Data visualization in our context is the secure, intelligent, comprehensive, and correlated presentation of logs, metrics and events gathered from our customers’ applications and infrastructure.

Can you give Anzr’s view on the key challenges and needs for any organization in this area?

Our takeaway from several years of experience is that monitoring and logging are implemented at a pretty basic level in most organizations, despite being the most important tools for observing your applications and infrastructure and ensuring traceability.

It is also often traditional monitoring stacks that get implemented, while several newer technologies offer better data flow and caching, transformation, correlation, and visualization.

Many monitoring and logging systems do not scale. They are deployed as large, central instances, where it becomes difficult to address all the requirements and wishes of the users. Our Data Visualization Platform is instead deployed in smaller instances for each team and is therefore adaptable to the requirements and wishes of each team. We address performance by keeping some services as a central instance.

Can you give a high-level description of Anzr’s Data Visualization Platform?

Our Data Visualization Platform is built on open standards and well-established open software.

Data collection is done with Beats, via Kafka and Logstash, to Elasticsearch, or directly to Kafka from, for example, applications and containers. Beats supports all commonly needed data formats and gathers data from applications, container nodes and servers. The platform can of course also collect data from other monitoring sources, such as existing agents in the environment. We also have many years of experience working with developers and system engineers, supporting them in instrumenting metric points in their applications and infrastructure.
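
As a concrete illustration of the direct-to-Kafka path, here is a minimal Python sketch of an application shipping a structured log event to a Kafka topic. The broker address and the topic name are illustrative assumptions, not part of the blueprint itself.

    # Minimal sketch: an application sends a structured log event straight
    # to Kafka, bypassing Beats. Broker address and topic name are assumed.
    import json
    from datetime import datetime, timezone

    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers="kafka.internal.example:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    event = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "service": "checkout",
        "level": "ERROR",
        "message": "payment gateway timeout",
    }
    producer.send("app-logs", event)
    producer.flush()  # block until the broker has acknowledged the event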

We use Kafka as a buffer layer to reduce the risk of congestion at high transaction volumes. Our experience is that when applications and infrastructure start producing large volumes of data, often in critical situations, existing monitoring platforms don’t always manage the workload. Kafka is a good way for us to address this challenge. Strimzi enables us to deploy Kafka on Kubernetes more easily.
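
What makes Kafka work as a buffer is that downstream consumers drain the topic at their own pace: a spike simply accumulates in Kafka, up to the topic’s retention, instead of overwhelming Logstash or Elasticsearch. A hedged sketch of the consuming side follows; the topic and group names are illustrative assumptions, and in the platform this role is played by Logstash rather than custom code.

    # Sketch: a consumer drains the buffered topic at its own pace and
    # commits offsets only after the event is safely handled downstream.
    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        "app-logs",
        bootstrap_servers="kafka.internal.example:9092",
        group_id="logstash-team-a",    # each team/pipeline gets its own group
        auto_offset_reset="earliest",  # start from the buffered backlog
        enable_auto_commit=False,      # commit only after processing
    )
    for message in consumer:
        print(message.value)           # stand-in for the real pipeline step
        consumer.commit()              # safe point: the event is handled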

Logstash then gathers the data, transforms it, and sends it to Elasticsearch. Creating structured data out of large volumes of unstructured data is very important, and we have good experience with these types of configurations.
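
Logstash pipelines are written in Logstash’s own configuration language; the Python sketch below only mirrors the kind of grok-style transformation involved, turning an unstructured access-log line into a structured document ready for Elasticsearch. The log format is an illustrative assumption.

    # Sketch: parse an unstructured log line into a structured document,
    # the same job a Logstash grok filter performs inside the pipeline.
    import re

    LINE = '10.0.0.5 - - [20/Oct/2022:10:12:01 +0000] "GET /api/orders HTTP/1.1" 500 312'

    PATTERN = re.compile(
        r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
    )

    match = PATTERN.match(LINE)
    if match:
        doc = match.groupdict()
        doc["status"] = int(doc["status"])  # typed fields allow range queries
        doc["bytes"] = int(doc["bytes"])
        print(doc)  # this structured document is what gets indexed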

The actual visualization is then built and presented with Kibana. We have worked a lot with Grafana as well but prefer Kibana nowadays; there is of course also a natural logic to that choice when working with Elasticsearch. We build different dashboards for our customers according to their needs.
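
Underneath a Kibana dashboard panel sits an Elasticsearch aggregation. As a hedged sketch of what such a panel runs, the query below counts error events per hour using the Python client (8.x-style keyword arguments); the index name, field names and cluster URL are illustrative assumptions.

    # Sketch: the kind of aggregation a dashboard panel is built on,
    # here error counts per hour. Names and the URL are assumed.
    from elasticsearch import Elasticsearch  # pip install elasticsearch

    es = Elasticsearch("https://elasticsearch.internal.example:9200")

    resp = es.search(
        index="app-logs-*",
        size=0,  # only the aggregation is needed, not the raw hits
        query={"term": {"level": "ERROR"}},
        aggs={"errors_per_hour": {"date_histogram": {
            "field": "@timestamp", "fixed_interval": "1h"}}},
    )
    for bucket in resp["aggregations"]["errors_per_hour"]["buckets"]:
        print(bucket["key_as_string"], bucket["doc_count"])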

The platform is designed to be scalable, with the possibility for different teams to subscribe to different data via separately deployed instances of Logstash, Elasticsearch and Kibana. Kafka is normally deployed as a central instance for scalability and segregation of duties.

The platform can be used for environments of all sizes. For large enterprise environments we build a central Kafka service layer and then deploy instances of Logstash, Elasticsearch and Kibana for the different teams to access and manage themselves. Everything is deployed as infrastructure as code with Argo CD or the CI/CD tooling of the customer’s choice.

Can you describe how the platform scales and is deployed?

We have designed it to run on Kubernetes, whether that is AKS, OpenShift, vanilla Kubernetes, etc. Kafka runs as a central service layer, with separate Logstash, Elasticsearch and Kibana instances provisioned per team. This gives the benefit of a robust data collection layer and the possibility for different teams to securely manage the components relevant to their own data visualization needs. This is important in a large enterprise environment with several teams managing infrastructure and workloads.
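
The blueprint’s low-level design is not public, but a common way (an assumption here) to realize this per-team isolation on Kubernetes is one namespace per team, into which that team’s Logstash, Elasticsearch and Kibana instances are deployed. A sketch with the official Python client; in practice this would of course live in the infrastructure-as-code repository rather than be run imperatively.

    # Sketch: one namespace per team as the isolation boundary for that
    # team's Logstash/Elasticsearch/Kibana stack. Team names are assumed.
    from kubernetes import client, config  # pip install kubernetes

    config.load_kube_config()  # or load_incluster_config() in-cluster

    v1 = client.CoreV1Api()
    for team in ("payments", "identity", "platform"):
        ns = client.V1Namespace(
            metadata=client.V1ObjectMeta(
                name=f"dataviz-{team}",
                labels={"app.kubernetes.io/part-of": "data-visualization"},
            )
        )
        v1.create_namespace(ns)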

All configuration is of course done as infrastructure as code, with versioning in a repository of the customer’s choice. Deployment pipelines can, for example, be built with Argo CD.

Is the platform relevant for workloads on, for example, Azure?

It depends on several factors, but the general answer is yes. Azure Monitor, Sentinel and Log Analytics are fantastic tools that we have worked with as well for several years.

But if you want to collect large volumes of data, perhaps encrypt it in a more cost-efficient and secure manner, and if you have a hybrid infrastructure, then our platform is a good choice, either stand-alone or integrated with Azure’s services.

What are our technical security controls for ensuring that information is accessed, transported and stored in a secure and tamper-proof manner?

One key component is that all communication between components in the platform, and to endpoints, is mTLS-encrypted and managed through a robust CA service. We can also encrypt all data at rest. That is all the detail we give in this channel.
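
On the client side, mTLS means each component presents its own certificate and verifies its peer against the platform’s CA. A minimal sketch for a producer connecting to Kafka; the certificate paths are illustrative assumptions.

    # Sketch: mutual TLS from a client to Kafka. The client both verifies
    # the broker (ssl_cafile) and presents its own certificate (the "m").
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers="kafka.internal.example:9093",
        security_protocol="SSL",
        ssl_cafile="/etc/pki/platform-ca.crt",  # trust anchor from the CA service
        ssl_certfile="/etc/pki/client.crt",     # this client's certificate
        ssl_keyfile="/etc/pki/client.key",
    )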

Access to the Data Visualization Platform is handled through the customer’s access management system, where we can also provide robust designs, but that is another topic.

The Data Visualization Platform is hardened and always designed with a write-once principle, among other measures. As mentioned before, it is deployed as infrastructure as code, so we only perform backups of the repository, not the actual production environment itself.

We can also design and implement immutable backups of the stored data. The customer might want long-term backups of audit logs and other information, but can accept a shorter retention time for data not needed beyond, for example, 90 days.
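
One way (an assumption here, not the blueprint’s stated mechanism) to express such a 90-day retention rule is an Elasticsearch index lifecycle management (ILM) policy. A sketch over the REST API; the policy name, URL and exact phase layout are illustrative.

    # Sketch: ILM policy that rolls indices over daily and deletes them
    # after 90 days. Names, URL and certificate paths are assumed.
    import requests  # pip install requests

    policy = {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                "delete": {"min_age": "90d", "actions": {"delete": {}}},
            }
        }
    }
    resp = requests.put(
        "https://elasticsearch.internal.example:9200/_ilm/policy/logs-90d",
        json=policy,
        cert=("/etc/pki/client.crt", "/etc/pki/client.key"),  # mTLS, as above
        verify="/etc/pki/platform-ca.crt",
    )
    resp.raise_for_status()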

What are the costs? Is it expensive to implement and run?

We of course charge for the design and implementation of the platform, and we can offer support agreements, but remember that we have chosen to work with open components in this case. This enables us to support the community and produce best-in-class, cost-efficient solutions for our customers, who own the platform and can manage it themselves.

Ok, final question. If a customer has other ideas about certain components, can we work with that and help them?

Yes. It is also common to use the platform as a source for the customer’s SIEM system, to produce specialized reports for audit logs, and so on.

That concludes this short, high-level summary of Anzr’s Data Visualization Platform. Low-level designs and more details are shared with customers that we actively work with. The next blueprint will be Anzr’s Secure Access Platform. Stay tuned for that!

If you are interested in learning more about Anzr and our blueprint catalogue, please feel free to reach out to our co-founders.

Magnus Blom, magnus@anzr.co, +46 (0)70 454 76 33
Robert Teir, robert@anzr.co, +46 (0)73 525 51 84

20 October, 2022