Engineering blog

Aug 2021

Spark for ML data preprocessing

In this post, we look at using Spark to speed up feature extraction and data preprocessing for ML models.
Mar 2021

How to use Makefiles to run a simple Map Reduce Data Pipeline

This post examines a straightforward way of running a parallel pipeline with the resources at hand, using a simple and established tool: Make.
Nov 2020

API design for cross-team collaboration

BenevolentAI uses various API technologies to enable cross-team collaboration and automation. This post discusses them and provides general guidelines for designing well-engineered, user-friendly APIs.
Aug 2020

Spark on Kubernetes for NLP at scale

Our team discusses how Spark and Kubernetes are used to ingest, normalize and process millions of scientific papers for NLP.
Jul 2020

Using Airflow With Kubernetes at BenevolentAI

At BenevolentAI we work with a Kubernetes-based infrastructure that is completely containerized. Using Kubernetes, we run many sophisticated workflows for data processing, model training, model serving, and metrics and evaluation.
Jun 2020

Deploying MetalLB In Production

MetalLB is a software load balancer for bare-metal Kubernetes clusters. In this post we walk through how we deployed MetalLB to our production cluster at BenevolentAI.
Jun 2020

It Takes Two: The Benefits Of Pair Working

BenevolentAI’s System Software Engineer, Rory, explores the benefits of pair working on research and decision making. In software development, this collaborative way of working can help build team support and cohesion, and ultimately help teams to excel.