Available for engagements

Besar Maxhuni

Building scalable ETL pipelines, data lakes, and distributed systems on AWS. Focused on cloud-native data platforms, analytics engineering, and FinOps-aware architectures.

12+ AWS services in production

13+ tools in the stack

3 featured projects

3 published articles

01 — Expertise & Tech Stack

Data flow interface

Core capabilities across AWS data services, distributed processing, and cloud-native engineering — focused on building reliable, cost-aware data platforms.

AWS Data Services

12 Services

S3 · Data lake
Redshift · Warehouse
Glue · ETL / Catalog
Lambda · Compute
Athena · Ad-hoc query
Kinesis · Streaming
Step Functions · Orchestration
EMR · Big data / Spark
QuickSight · BI dashboards
CloudWatch · Monitoring
Lake Formation · Lake governance
SageMaker · ML platform

Stack

13 Tools

Python · SQL · Spark · Kafka · dbt · Airflow · Docker · Terraform · PostgreSQL · Pandas · Power BI · Tableau · Excel

FinOps

cost-aware design

Storage tiering · query optimization · right-sized compute

Education

M.S. CS

Big Data Science · ongoing

pipeline.topology

streaming

batch flow · stream / ad-hoc · raw → curated → analytics

data-lake → warehouse → analytics · IaC + version-controlled pipelines

02 — Featured Projects

Active nodes

Operational data systems and analytical deep-dives — each ships with reproducible infrastructure or a documented query pipeline.

fraud-triage.service · PRODUCTION

Real-Time Fraud Triage System

End-to-end fraud detection workflow with a relational backbone and an interactive triage UI. Containerized for reproducible local and cloud deployment.

→ PostgreSQL schema for transactions and triage events
→ Streamlit dashboard for analyst review
→ Docker Compose stack for one-command bring-up

Python · PostgreSQL · Streamlit · Docker

ai-impact-2030.analysis · CASE STUDY

AI Impact Jobs 2030

Exploratory analysis of a Kaggle dataset projecting AI's impact on the labor market through 2030. SQL-driven aggregation surfaces sector-level shifts and exposure trends.

→ Window functions for cohort-over-time comparison
→ Sector clustering by automation exposure
→ Reproducible SQL pipeline for re-runs

SQL · Kaggle · Data Analysis
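
As a rough illustration of the cohort-over-time comparison listed above, the sketch below uses a LAG window function to compare each sector's value with the previous year. The table name, column names, and figures are made up for illustration and are not taken from the Kaggle dataset or the project's actual queries. SQLite is used only so the query runs without a database server.

```python
import sqlite3

# Hypothetical toy data: (sector, year, exposure in percent).
# These numbers are invented for the example, not real dataset values.
ROWS = [
    ("finance", 2024, 40),
    ("finance", 2025, 46),
    ("finance", 2026, 55),
    ("health", 2024, 20),
    ("health", 2025, 23),
]

# LAG() looks one row back within each sector, ordered by year,
# so "change" is the year-over-year difference (NULL for the first year).
QUERY = """
SELECT sector,
       year,
       exposure_pct,
       exposure_pct - LAG(exposure_pct) OVER (
           PARTITION BY sector ORDER BY year
       ) AS change
FROM sector_exposure
ORDER BY sector, year
"""


def year_over_year(rows):
    """Load rows into an in-memory table and return the windowed result."""
    conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
    try:
        conn.execute(
            "CREATE TABLE sector_exposure "
            "(sector TEXT, year INTEGER, exposure_pct INTEGER)"
        )
        conn.executemany("INSERT INTO sector_exposure VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


for row in year_over_year(ROWS):
    print(row)
```

The same PARTITION BY / ORDER BY pattern carries over to PostgreSQL or Redshift with little change, which is one reason window functions suit re-runnable SQL pipelines.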

reviews-etl.orchestrator · PRODUCTION

AWS Serverless Big Data Pipeline

An automated ELT/ETL pipeline for raw customer review data, built on a decoupled serverless compute layer: Step Functions orchestrates Glue (PySpark) transforms over S3 and asynchronous SQL on Redshift Serverless.

→ Asynchronous query execution via the Redshift Data API: the pipeline submits SQL and checks for completion instead of holding a database connection open.
→ PySpark schema enforcement that converts unstructured landing files into compressed Apache Parquet.
→ Least-privilege IAM scoped to the actions the pipeline calls (redshift-serverless:GetCredentials, redshift-data:ExecuteStatement).

Step Functions · Glue (PySpark) · Redshift Serverless · S3
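
The asynchronous Redshift Data API pattern mentioned above can be sketched roughly as follows. This is a minimal sketch, not the project's code: the function name `run_statement`, the workgroup and database names, and the in-process polling loop are all assumptions. In a Step Functions workflow the wait-and-check loop would more likely live in the state machine than in Python. The client is passed in so that no AWS credentials are needed to read or test the logic.

```python
import time


def run_statement(client, workgroup, database, sql,
                  poll_seconds=1.0, timeout_seconds=300.0, sleep=time.sleep):
    """Submit SQL via the Redshift Data API and poll until it completes.

    `client` is expected to behave like boto3.client("redshift-data").
    Returns the statement Id on success.
    """
    # execute_statement returns immediately with an Id; the query runs server-side.
    submitted = client.execute_statement(
        WorkgroupName=workgroup,  # Redshift Serverless workgroup (hypothetical name)
        Database=database,
        Sql=sql,
    )
    statement_id = submitted["Id"]

    waited = 0.0
    while True:
        desc = client.describe_statement(Id=statement_id)
        status = desc["Status"]
        if status == "FINISHED":
            return statement_id
        if status in ("FAILED", "ABORTED"):
            detail = desc.get("Error", "no error message")
            raise RuntimeError(f"Statement {statement_id} {status}: {detail}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Statement {statement_id} still {status}")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage against a real workgroup (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("redshift-data")
#   run_statement(client, "my-workgroup", "dev", "SELECT 1")
```

Because the caller only submits a statement and checks its status, no database connection or driver has to stay open while the query runs.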

03 — Publications

Data logs

Long-form notes on the patterns I rely on day-to-day — built from real implementation experience, not just documentation.

publication_log::medium · new

2026

FinOps: The New Superpower Every Data Engineer Needs

Why cloud cost management and data engineering have permanently merged in 2026 — the explosion in AI spend, FinOps moving from the back office to the boardroom, and the rise of the technical FinOps specialist who reads both the pipeline and the P&L.

FinOps · AI Cost · Data Engineering · Cloud

publication_log::medium

2026

Better Together: Integrating Amazon S3 and Redshift for Modern Analytics

A practical look at how S3 and Redshift complement each other in a modern analytics stack — from raw landing zones through curated, query-ready layers using Redshift Spectrum.

AWS · S3 · Redshift · Analytics

publication_log::medium

2026

Cloud Architecture in the Age of AI Agents: What's Actually Changing

How cloud infrastructure is being redesigned around AI that acts rather than responds — multi-agent orchestration, the MCP and A2A protocol layer, FinOps for always-on compute, and IAM built for non-human actors.

AI Agents · MCP · FinOps · Cloud Security