Launching Beamstack at the Beam Summit conference, Google Campus, Sunnyvale, CA

Beamstack is an open-source framework, currently under development, for deploying Machine Learning and GenAI workflow pipelines with Apache Beam on Kubernetes.

Introduction

The goal of Beam Summit is to connect a worldwide community of professionals who use, contribute to, and are learning Apache Beam. This annual conference provides space to share use cases and performance and resource optimizations, discuss pain points, and talk about the benefits of implementing Apache Beam in organizations. The event brings the Apache Beam community together to discuss the project’s status, its technical advances, and its future.

The content focuses on:

  • New use cases from companies using Apache Beam.
  • Community-driven talks.
  • Technical deep dives.
  • In-depth workshops.

At Beam Summit, Beamstack was launched as an open-source framework for deploying Machine Learning and GenAI workflow pipelines with Apache Beam on Kubernetes. MavenCode ML engineers and product managers introduced it in two separate presentations:

  • The first presentation, “Beamstack: An Open source Framework for running Machine Learning Pipelines with Apache Beam”, delivered by Olufunbi Babalola and Mathew Fait, introduced Beamstack, its product roadmap, and how to use it, and closed with a call for participation.
  • The second presentation, “A Low Code Structured Approach to Deploying Apache Beam ML Workloads on Kubernetes using Beamstack”, delivered by Charles Adetiloye and Nate Salawe, covered Beamstack’s architecture, key features, use cases and demos, and future roadmap.

Beamstack: An Open source Framework for running Machine Learning Pipelines with Apache Beam

Agenda

  • Overview of Beamstack
  • Why do you need Beamstack?
  • Beamstack Architecture
  • How to use Beamstack
  • Product Roadmap
  • Call for Participation

Overview of Beamstack

Beamstack is an open-source framework, currently under development, for deploying Machine Learning and GenAI workflow pipelines with Apache Beam on Kubernetes. Beamstack provides a Command Line Interface (CLI) intended to reduce the complexity and time of pipeline deployment. It also provides monitoring and visualization features for observing running jobs.

Why do you need Beamstack?

  • Configurable Deployment Environment: With minimal steps, you can set up Beamstack on a local minikube, bare metal, or cloud infrastructure.
  • Ease of Deployment with Beam Low-code YAML: Beamstack adopts a low-code approach to pipeline deployment, which makes the process easier and faster.
  • Composable and Reusable Pipeline Components: Pipeline components are designed for easy composition and customization, enabling efficient workflow creation and deployment.
  • Collaborative Setup for Development Teams: Beamstack’s modular architecture facilitates collaborative setups for the various technical teams within an organization.
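
To make the low-code and composable-components points concrete, here is a minimal sketch of how reusable components might be composed into a Beam YAML pipeline spec. The helper functions (`read_csv`, `keep_rows`, `write_json`, `chain`), the file paths, and the filter expression are hypothetical and are not part of Beamstack; the `ReadFromCsv`, `Filter`, and `WriteToJson` transform types and the `chain` pipeline type come from Apache Beam's YAML API. The spec is printed as JSON, which YAML parsers generally accept.

```python
import json


# Hypothetical reusable components: each returns one Beam YAML transform entry.
def read_csv(path):
    return {"type": "ReadFromCsv", "config": {"path": path}}


def keep_rows(expression):
    # Keep only the rows for which the Python expression is true.
    return {"type": "Filter", "config": {"language": "python", "keep": expression}}


def write_json(path):
    return {"type": "WriteToJson", "config": {"path": path}}


def chain(*transforms):
    """Compose components into a linear Beam YAML pipeline spec."""
    return {"pipeline": {"type": "chain", "transforms": list(transforms)}}


# The same components can be recombined into different pipelines.
spec = chain(
    read_csv("input/*.csv"),             # placeholder input path
    keep_rows("score > 0.5"),            # placeholder column and threshold
    write_json("output/results.json"),   # placeholder output path
)

print(json.dumps(spec, indent=2))
```

Keeping each component as a small function that returns plain data means teams can share and recombine components without touching pipeline code.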

Beamstack Architecture

How to use Beamstack

  • Install the binary: Install the beamstack binary from the releases section of the Beamstack GitHub repository.
  • Configure the target environment: Beamstack initializes your target Kubernetes cluster with the necessary components.
  • Select the YAML package to deploy: Deploy your YAML pipeline by running the beamstack deploy command.
  • Monitor your running jobs: Open the Grafana dashboard or runner UI to observe your running jobs.
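
As a rough illustration of preparing for the deploy step, the sketch below checks whether a `beamstack` binary is on the PATH and writes a small Beam YAML pipeline file to disk. It is an assumption-laden sketch, not Beamstack's own tooling: the helper names, the file name `pipeline.yaml`, and the example paths are hypothetical, and the arguments that `beamstack deploy` expects are not described here, so the script stops short of invoking it.

```python
import json
import shutil
import tempfile
from pathlib import Path

# Minimal Beam YAML pipeline; the paths are placeholders for illustration.
example_spec = {
    "pipeline": {
        "type": "chain",
        "transforms": [
            {"type": "ReadFromText", "config": {"path": "input/*.txt"}},
            {"type": "WriteToText", "config": {"path": "output/lines"}},
        ],
    }
}


def beamstack_on_path():
    """Return the resolved path of the `beamstack` binary, or None if absent."""
    return shutil.which("beamstack")


def write_pipeline_file(spec, directory):
    """Write the spec to disk (hypothetical file name) for the deploy step."""
    path = Path(directory) / "pipeline.yaml"
    # JSON is written here because YAML parsers generally accept it.
    path.write_text(json.dumps(spec, indent=2) + "\n")
    return path


workdir = tempfile.mkdtemp()
pipeline_file = write_pipeline_file(example_spec, workdir)
binary = beamstack_on_path()

if binary is None:
    print("beamstack not found on PATH; install the binary from the releases first")
else:
    print(f"found {binary}; {pipeline_file} is ready for the deploy step")
```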

Beamstack Roadmap

Feature Implementation

Tasks Breakdown

Call for Participation

We encourage you to join our Discord channel to engage with fellow Beamstack developers, become a contributor, and help drive adoption of Beamstack.

A Low Code Structured Approach to Deploying Apache Beam ML Workloads on Kubernetes using Beamstack

Agenda

  • Overview of Beamstack
  • Architecture Overview
  • Key Features of Beamstack
  • Beamstack Use Cases / Demos
  • Future Roadmap

Overview of Beamstack

Beamstack is an open-source framework, currently under development, for deploying Machine Learning and GenAI workflow pipelines with Apache Beam on Kubernetes. Beamstack provides a Command Line Interface (CLI) intended to reduce the complexity and time of pipeline deployment. It also provides monitoring and visualization features for observing running jobs.

Architecture Overview

Key Features of Beamstack

Beamstack Use Cases & Demos

Beamstack can currently be used in the following ways:

Example Use Case: Creating Text Embedding + Saving it to Vector Database
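
The shape of this use case (embed text, store the vectors, query by similarity) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the pipeline shown in the demo: `embed` is a hashed bag-of-words stand-in for a real embedding model, `InMemoryVectorStore` is a hypothetical stand-in for a real vector database, and the documents are made up. In an actual deployment, these steps would run as transforms in an Apache Beam pipeline.

```python
import hashlib
import math

DIM = 256  # toy embedding size, an arbitrary choice for this sketch


def embed(text, dim=DIM):
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self._rows = []  # list of (doc_id, text, vector)

    def upsert(self, doc_id, text, vector):
        # Replace any existing row with the same id, then append the new one.
        self._rows = [row for row in self._rows if row[0] != doc_id]
        self._rows.append((doc_id, text, vector))

    def search(self, query_vector, top_k=1):
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(query_vector, vec)), doc_id, text)
            for doc_id, text, vec in self._rows
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


docs = {
    "d1": "apache beam pipelines on kubernetes",
    "d2": "grafana dashboards for monitoring jobs",
}

store = InMemoryVectorStore()
for doc_id, text in docs.items():
    store.upsert(doc_id, text, embed(text))

# Query the store and print (score, doc_id, text) for the closest document.
for score, doc_id, text in store.search(embed("beam pipelines kubernetes")):
    print(f"{doc_id} ({score:.2f}): {text}")
```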

Beamstack Demo

Future Roadmap