SQL Streaming Module – Documentation & Evolutions
Contributions to an internal SQL-on-Flink streaming module at the core of Bouygues Telecom's ETL. Main deliverable: an automated documentation system and CI/CD integration to make the module production-ready.
Context
Bouygues Telecom's internal ETL includes a SQL streaming module built on top of Apache Flink. It exposes a SQL-derived language letting data engineers define streaming jobs (source → sink) without writing Flink code directly. The module was functional but not production-ready: connectors lacked documentation and the available options were invisible to users.
Approach
- Deep-dived into the internal ETL architecture and the streaming module internals to understand the full connector system
- Built a Maven Java plugin that parses the module's source code, extracts annotations, and auto-generates Markdown documentation
- Deployed a Docusaurus static site to expose the generated docs in a navigable, user-friendly format
- Integrated the documentation pipeline into the existing CI/CD chain
- Worked on streaming evolutions: error handling improvements and robustness to malformed data
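The annotation-to-Markdown step in the list above can be illustrated with a minimal sketch. It is not the actual plugin: the real tool parses source code inside a Maven plugin, whereas this sketch swaps in runtime reflection to stay self-contained. The `@Connector` and `@ConnectorOption` annotations, the `KafkaSourceExample` class, and the table layout are all hypothetical names chosen for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ConnectorDocSketch {

    /** Hypothetical annotation describing a connector (assumed, not the module's real API). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Connector {
        String name();
        String description();
    }

    /** Hypothetical annotation describing one user-facing connector option. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface ConnectorOption {
        String key();
        String description();
        boolean required() default false;
    }

    /** Made-up connector used only to exercise the generator. */
    @Connector(name = "kafka-source", description = "Reads records from a Kafka topic.")
    static class KafkaSourceExample {
        @ConnectorOption(key = "topic", description = "Topic to consume.", required = true)
        String topic;

        @ConnectorOption(key = "group.id", description = "Consumer group identifier.")
        String groupId;
    }

    /** Renders one annotated connector class as a Markdown section with an options table. */
    static String toMarkdown(Class<?> connectorClass) {
        Connector connector = connectorClass.getAnnotation(Connector.class);
        if (connector == null) {
            throw new IllegalArgumentException(connectorClass.getName() + " is not annotated with @Connector");
        }

        // Collect annotated fields; reflection gives no ordering guarantee, so sort by key.
        List<ConnectorOption> options = new ArrayList<>();
        for (Field field : connectorClass.getDeclaredFields()) {
            ConnectorOption option = field.getAnnotation(ConnectorOption.class);
            if (option != null) {
                options.add(option);
            }
        }
        options.sort(Comparator.comparing(ConnectorOption::key));

        StringBuilder md = new StringBuilder();
        md.append("## ").append(connector.name()).append("\n\n");
        md.append(connector.description()).append("\n\n");
        md.append("| Option | Required | Description |\n");
        md.append("| --- | --- | --- |\n");
        for (ConnectorOption option : options) {
            md.append("| `").append(option.key()).append("` | ")
              .append(option.required() ? "yes" : "no").append(" | ")
              .append(option.description()).append(" |\n");
        }
        return md.toString();
    }

    public static void main(String[] args) {
        System.out.print(toMarkdown(KafkaSourceExample.class));
    }
}
```

Because the page is generated from the annotations themselves, an option added or renamed in code shows up in the reference on the next build. A static site generator such as Docusaurus can then serve the resulting Markdown files.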
Solution
An automated documentation system that regenerates the streaming module's connector reference from the source code and deploys it through CI/CD, so the reference stays in sync with the code and data engineers have a reliable, up-to-date guide for writing streaming jobs autonomously.
Key outcome
The streaming module went from entirely undocumented to fully referenced, so data engineers no longer have to inspect source code to find out which connectors and options are available.