About

About Me

ML Engineer specialized in MLOps and cloud‑native AI systems. I design, containerize, and operate ML services that meet production SLOs for latency, throughput, availability, and cost. I bridge data science with platform engineering: reproducible training, efficient serving (FastAPI/BentoML/vLLM/LiteLLM), robust observability, and safe CI/CD on Docker/Kubernetes/ECS.

Operating Principles

  • Reproducibility first: versioned data/model/code with MLflow and immutable artifacts
  • Observability by design: metrics for drift, latency, errors; actionable alerts
  • Reliability & SLOs: graceful degradation, fallback policies, shadow/canary releases
  • Performance & scale: adaptive batching, worker isolation, autoscaling on K8s/ECS
  • Cost efficiency: throughput per dollar, right‑sizing, caching and cold‑start control
  • Security & governance: access control, audit trails, policy‑driven deployments
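
As an illustration of the batching principle above, here is a minimal sketch of size-and-deadline micro-batching using only the Python standard library. It is a simplified stand-in, not the adaptive algorithm of any particular serving framework: the function name `collect_batch` and the parameters `max_batch_size` and `max_wait_s` are hypothetical, chosen for this example.

```python
import queue
import time


def collect_batch(requests, max_batch_size=8, max_wait_s=0.01):
    """Gather up to max_batch_size items, waiting at most max_wait_s after the first.

    `requests` is a queue.Queue of pending inference requests (assumed shape).
    """
    # Block until at least one request arrives, then start the deadline clock.
    batch = [requests.get()]
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline reached: ship a partial batch to protect latency
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived before the deadline
    return batch
```

The two knobs trade throughput against latency: a larger `max_batch_size` improves hardware utilization, while a smaller `max_wait_s` bounds the extra delay any single request can incur. An adaptive variant would tune both from observed latency.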

Architecture Highlights

  • Model serving: FastAPI/BentoML services with separate ML workers and web workers for concurrency; autoscale via Kubernetes/ECS; adaptive batching for GPUs/CPUs
  • Experiment tracking: MLflow runs, artifacts, and metrics to ensure comparable, reproducible iterations across datasets and hyperparameters
  • Drift & performance monitoring: KS‑tests on match scores and response latency; thresholded alerts and rollback/traffic shifting when quality degrades
  • Storage & data: S3/MinIO object storage with deterministic preprocessing; schema validation and data quality gates
  • CI/CD: GitHub Actions building and testing containers; smoke tests, blue/green or canary strategies for safe rollouts
  • Eventing: MQTT where event‑driven triggers decouple retraining and inference pipelines
  • LLMOps: vLLM/LiteLLM for efficient token throughput; prompt/version management, caching, and cost/latency accounting
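
The KS-based drift check mentioned above can be sketched as follows. This is a minimal standard-library illustration, not the production implementation: the function names, the 0.05 significance level, and the use of the large-sample critical value are assumptions made for the example. In practice a library routine such as `scipy.stats.ks_2samp` would be the more likely choice.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    n, m = len(ref), len(cur)
    d = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so evaluating the gap at those points is sufficient.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / n
        cdf_cur = bisect_right(cur, x) / m
        d = max(d, abs(cdf_ref - cdf_cur))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample KS test at level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the critical value (assumed policy)."""
    d = ks_statistic(reference, current)
    return d > ks_critical_value(len(reference), len(current), alpha)
```

Here `reference` would hold a baseline window of match scores or latencies and `current` a recent window. A positive result could feed the thresholded alerts and traffic-shifting decisions described in the list.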

Experience Snapshots

  • DialFlow (Winner 2025): Gen‑AI voice agent using Twilio, FastAPI, Redis, ElevenLabs, and LangChain; streaming pipeline, prompt governance, and service KPIs for reliability and cost control
  • UM6P — DICEDATALAB: built forensic image feature‑matching service; template code for retrainable workflows; online drift monitoring on scores and latency with KS tests
  • 3D Smart Factory: NLP pipeline with S3 storage, SymSpell normalization, retraining loop; Flask + WSGI serving on AWS ECS
  • LR Consulting Maroc: confidential PDF‑to‑JSON API (Tabula/Pandas) with offline processing guarantees
  • Wikreate Agency: React + Laravel data visualization interface with pagination and search; robust API integration and testing

Technical Stack

stack = {
    "serving": ["FastAPI", "BentoML", "vLLM", "LiteLLM"],
    "orchestration": ["Docker", "Docker Compose", "Kubernetes"],
    "tracking": ["MLflow", "DVC"],
    "monitoring": ["custom KPIs", "drift metrics (KS)", "SLO dashboards"],
    "infrastructure": ["AWS ECS/S3", "GCP", "Docker", "Kubernetes"],
    "messaging": ["MQTT"],
    "data_storage": ["S3/MinIO", "PostgreSQL", "MongoDB", "Redis"],
    "automation": ["CI/CD", "GitHub Actions", "Git"],
    "fundamentals": ["Python core", "Adaptive batching", "Drift detection", "System design"]
}

Writing

I write about practical ML system challenges with math and implementation:

  • Time series analysis (ACF/PACF, Kalman filtering, spectral methods)
  • Data drift (covariate shift, prior probability shift, sample selection bias)
  • Inference optimization (batching strategies, processing patterns)
  • CNNs and feature engineering in production contexts

Contact

GitHub: @IbLahlou
LinkedIn: ibrahimlahlou-ild01
Kaggle: ibrahimld01
Twitter/X: @ILoDo01
Email: ibrahimlahlou021@gmail.com