AI Engineer · LLM Systems · Python Backend
I build AI systems that ship and hold under pressure.
Agentic architectures (LangGraph, MCP), production RAG pipelines, and observable Python backends — instrumented end-to-end so failures are never silent.
- 120K+ Documents processed via RAG
- 8 Production endpoints shipped
- 99.5% Uptime under concurrent load
About
Engineering trust into AI systems.
I work at the intersection of agentic architecture, RAG pipelines, and production Python backend engineering. At Avaxia, I deployed multi-step LLM workflows using MCP and LangGraph across enterprise clients, and instrumented them end-to-end with Langfuse and LangSmith — because in production, failures should never be silent.
Earlier, at RegimLab, I built EEG-based clinical AI that reached 92% ADHD classification accuracy, layered with Explainable AI (SHAP, LIME, Integrated Gradients) and validated by domain clinicians. In healthcare, a model that can't explain itself is a model that can't be trusted.
I care about systems that are observable, testable, and honest about their failure modes.
What I work with
Skills & Stack
LLM & Agentic Systems
The core of my work — building reliable agentic pipelines that ship to production, not just demos.
- LangGraph
- MCP
- A2A
- LangChain
- LlamaIndex
- OpenAI API
- Claude API
- Ollama
- Prompt Engineering
- Tool Use
- Context Management
RAG & Retrieval
- RAG Pipelines
- Milvus
- Semantic Chunking
- Embeddings
- Hybrid Search
- Re-ranking
- LangSmith
- Langfuse
- RAGAS
- n8n
Backend & APIs
- FastAPI
- Flask
- asyncio
- Celery
- Redis
- REST
- PostgreSQL
- MongoDB
- Pydantic
- WebSockets
MLOps & Observability
- Docker
- Kubernetes
- ArgoCD
- MLflow
- GitHub Actions
- Azure
- AWS
- Selenium
- CI/CD for ML
- Model Versioning
Research & XAI
- SHAP
- LIME
- Integrated Gradients
- PyTorch
- EEGNet
- CNN / LSTM
- Time-Series
- Hugging Face
- LoRA / PEFT
- Quantization
- Python
- Java
- C++
- SQL
- Bash
- Arabic (Native)
- English (Fluent)
- French (Fluent)
Where I've worked
Experience
AI Backend Developer
- Architected and shipped production-grade agentic AI systems using LangGraph and MCP, orchestrating multi-step tool-calling workflows across 2 enterprise clients and processing 120K+ documents through semantic RAG pipelines — lifting retrieval precision from 61% → 84% (RAGAS context precision).
- Engineered 8 production RESTful endpoints with FastAPI and async Python exposing LLM inference, RAG retrieval, and conversational AI services — sustaining 99.5% uptime and a 38ms median response time under concurrent load.
- Cut API p99 latency by 76% (2.8s → 670ms) via cProfile-driven optimization, asyncio refactoring, and Redis TTL tuning — scaling throughput from 120 to 520 RPM without infrastructure changes.
- Owned end-to-end delivery of 4 production AI features across 8 months, translating requirements from 2 cross-functional teams into deployable services on Azure with ArgoCD-managed GitOps pipelines.
- Drove code quality through structured PR reviews and a pytest + Selenium suite over 40+ critical paths — reaching 78% coverage and cutting production incidents by 35%.
- Python
- FastAPI
- LangGraph
- MCP
- LangChain
- LlamaIndex
- OpenAI
- Claude
- Mixtral
- Ollama
- Redis
- Celery
- Milvus
- Langfuse
- LangSmith
- n8n
- Azure
- ArgoCD
- Docker
- Selenium
Machine Learning Researcher · Intern
- Designed a clinical-grade ADHD detection system on EEG time-series from 180 subjects, combining CNN, LSTM, and EEGNet with Explainable AI overlays — the lab's first XAI-augmented diagnostic pipeline validated by medical reviewers.
- Engineered a signal processing pipeline (VMD + ICA) to clean 64-channel EEG recordings — compressing raw feature space from 320+ dimensions to 28 informative biomarkers while preserving clinically relevant frequency bands.
- Reached 92% accuracy (F1: 0.91) — beating the prior lab baseline by +6 pp — via a CNN/LSTM/EEGNet ensemble with PCA + RFE feature selection on multi-band EEG spectrograms.
- Elevated clinical trust through SHAP, LIME, and Integrated Gradients overlays generating per-prediction biomarker attribution maps — reducing physician review time by 22%, validated by 2 domain clinicians.
- Python
- PyTorch
- EEGNet
- CNN
- LSTM
- MNE
- SHAP
- LIME
- Integrated Gradients
- XGBoost
- PCA
- VMD
- ICA
Data Science Intern
- Delivered an internal NLP chatbot handling 60+ daily queries with integrated sentiment analysis — reaching 84% intent classification accuracy on a 12K-sample domain corpus via fine-tuned BERT.
- Compressed a BERT-based model from 340MB to 87MB using LoRA fine-tuning and INT8 quantization — a 2.2× inference speedup with under 1% accuracy loss, enabling deployment on memory-constrained instances.
- Containerized and deployed the full NLP inference stack on Kubernetes with zero-downtime rolling releases — 99.5% availability over 3 months, tracking 22 experiment runs in MLflow.
- Cut end-to-end chatbot response time by 30% (1.4s → 980ms) via Flask request batching, model-layer caching, and INT8 quantization — lifting post-launch user satisfaction by 14%.
- Python
- Flask
- BERT
- PyTorch
- Hugging Face
- LoRA
- INT8 Quantization
- Docker
- Kubernetes
- GitHub Actions
- MLflow
- AWS
Software Engineer Intern
- Launched a full-stack project tracking platform (MEAN stack) adopted by 3 internal teams and 22 users within 2 weeks — integrating OAuth 2.0 / JWT and real-time webhook notifications with zero security incidents post-launch.
- Designed a behavioral data pipeline capturing 8+ interaction event types at roughly 4K events/day — powering analytics dashboards and personalization models with Pandas and SQL.
- Built content-based and collaborative-filtering recommenders in Scikit-learn — a 12% lift in task click-through and 9% improvement in collaborator discovery over the non-personalized baseline.
- Ran 3 A/B experiments (p < 0.05) demonstrating an 8% reduction in project completion time — informing the product team's full rollout decision.
- MEAN Stack
- JWT
- OAuth 2.0
- Redux
- Python
- Scikit-learn
- Pandas
- SQL
- A/B Testing
- Recommenders
A closer look
Selected Work
Three projects from the work above — agentic AI in production, clinical XAI, and edge-deployable NLP.
Want to see more?
Visit my GitHub
Credentials
Education & Certifications
- 2020 – 2023
Engineering Degree · Computer Science & Applied Mathematics
National Engineering School of Sfax (ENIS) · Sfax, Tunisia
- 2017 – 2020
Pre-engineering Cycle · Mathematics & Physics
Preparatory Institute of Engineering Studies, El Manar · Tunis, Tunisia
- 2016
Baccalaureate in Mathematics
High School · Tunis, Tunisia
Certifications
- Huawei HCIA-AI · Jun 2022 – Jun 2025
Languages
- Arabic (Native)
- English (Fluent)
- French (Fluent)
Get in touch
Let's build something together.
Open to AI Engineer / LLM Systems / Python Backend roles and consulting. I usually reply within 24 hours.