Open to opportunities · Hyderabad, India

Telugu Doddi Suresh

Senior Data Engineer · Azure · PySpark · Databricks

Results-driven Data Engineer with 4+ years of experience building cloud-native data platforms on Azure. Specialist in Lakehouse architecture, Delta Lake pipelines, and PySpark transformations that power business intelligence at scale.

Azure Lakehouse Architecture
MongoDB · PostgreSQL · Azure Blob
↓ ingest via Azure Data Factory
Azure Databricks · PySpark
↓ transform, validate & enrich
Delta Lake — 🥉 Bronze → 🥈 Silver → 🥇 Gold
↓ serve to consumers
Power BI · ML Models · Analytics
50% Cost Reduction · 70% Fewer Errors · 4+ Years Experience
About Me

Azure-native.
Data-obsessed.

I'm a Senior Data Engineer specialising in cloud-native solutions built on Azure. My work spans end-to-end pipeline architecture, from ingesting raw data from MongoDB, PostgreSQL, and Azure Blob Storage to the Gold-layer Delta Lake models that power executive dashboards.

I care deeply about data quality, governance, and reliability. Using Unity Catalog and Microsoft Purview, I build systems where stakeholders can trust every number they see.

Currently at Deloitte Consulting (USI), after impactful senior roles at LTIMindTree and HashStack Solutions — delivering measurable improvements at each step.

Core Focus Areas
Lakehouse Architecture · ETL / ELT · Delta Lake · Data Governance · Performance Optimization · Cloud-Native Azure · PySpark · Databricks
Education
B.Tech — Computer Science & Engineering
Malla Reddy Institute of Technology · 2017 – 2021
Working Style
Agile / Scrum · Dimensional Modeling · Data Lineage · CI/CD
Tech Stack

Built with the right tools.

Technologies I use daily to architect, build, and operate production-grade Azure data platforms.

Processing
Distributed Computing
PySpark · Apache Spark · Spark SQL · Azure Databricks
🔄
Orchestration
Pipeline Orchestration
Azure Data Factory · Azure Pipelines · Workflow Triggers
🏛️
Architecture
Data Architecture
Delta Lake · Lakehouse · Bronze/Silver/Gold · ETL/ELT
🗄️
Databases
Data Storage
PostgreSQL · MongoDB · Azure Blob Storage · NoSQL
🛡️
Governance
Data Governance
Unity Catalog · Microsoft Purview · Data Lineage · Data Quality
🐍
Languages
Programming
Python · SQL · Pandas · NumPy
📊
Visualization
Reporting & BI
Power BI · Dashboards · EDA
🔬
Specialised
Advanced Skills
OCR / OpenCV · Tesseract · Feature Engineering · Dimensional Modeling
Experience

Where I've built things.

A track record of delivering scalable data infrastructure across product, consulting, and enterprise environments.

March 2026 – Present
Consultant — Python Data Engineer
Deloitte Consulting India Pvt. Ltd. (USI) · Hyderabad
  • Designing and delivering cloud-native data solutions for enterprise clients in a consulting capacity
  • Applying Python, PySpark, and the full Azure stack to solve complex data engineering challenges at scale
September 2025 – February 2026
Senior Data Engineer
LTIMindTree Limited · Hyderabad
  • Delivered senior-level data engineering across complex enterprise data platform initiatives
  • Drove pipeline performance improvements and cross-team architecture reviews
October 2021 – August 2025
Data Engineer
HashStack Solutions Pvt. Ltd. (formerly Zitisi Solutions LLP) · Hyderabad
  • Designed and optimized scalable ETL/ELT pipelines using PySpark and Spark SQL in Azure Databricks, processing high-volume behavioural and transactional data from MongoDB
  • Implemented Delta Lake architecture (Bronze → Silver → Gold) with schema evolution and time travel for reliable batch and incremental data loads
  • Applied Spark SQL optimizations — broadcast joins, predicate pushdown, caching — reducing compute cost and improving pipeline execution by 50%
  • Designed scalable data models and partitioning strategies in Databricks, delivering 20% better query performance and 50% reduction in operating costs
  • Developed custom Spark-based data quality checks, detecting anomalies early and reducing transformation errors by over 70%
  • Migrated legacy pipelines from PostgreSQL to Apache Spark in Databricks, achieving significant scalability and throughput gains
  • Built Python automation using Tesseract OCR and OpenCV for intelligent image extraction from PDFs, boosting throughput by 75% via multi-threading
  • Developed a Python billing automation app with PostgreSQL backend and WhatsApp invoice delivery, cutting manual effort by 60%
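A simplified sketch of the rule-based data quality checks mentioned above, shown in plain Python over a list of records rather than a Spark DataFrame so the pattern itself is runnable; column names (`user_id`, `amount`) and the rules are illustrative assumptions, not the production implementation:

```python
# Illustrative rule-based data quality checks, similar in spirit to the
# Spark-based validations described above. In production these rules would
# run as DataFrame filters; here they operate on plain dicts.

def check_record(record):
    """Return a list of rule violations for one record (empty = clean)."""
    errors = []
    if record.get("user_id") is None:          # completeness check
        errors.append("missing user_id")
    amount = record.get("amount")
    if amount is None or amount < 0:           # validity / range check
        errors.append("invalid amount")
    return errors

def quarantine(records):
    """Split records into clean rows and quarantined rows with reasons."""
    clean, bad = [], []
    for rec in records:
        errors = check_record(rec)
        if errors:
            bad.append((rec, errors))   # quarantined with its reasons
        else:
            clean.append(rec)
    return clean, bad

rows = [
    {"user_id": 1, "amount": 42.0},
    {"user_id": None, "amount": 10.0},   # fails completeness
    {"user_id": 2, "amount": -5.0},      # fails range check
]
clean, bad = quarantine(rows)
print(len(clean), len(bad))  # → 1 2
```

Quarantining bad rows with explicit reasons, rather than silently dropping them, is what lets anomalies surface early instead of propagating into downstream transformations.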
Projects

Work that ships and scales.

Production data engineering projects spanning Lakehouse architecture, OCR automation, and pipeline optimisation.

🧠
MayaMaya — AI-Powered Career Platform
End-to-end Lakehouse platform processing psychometric, behavioural, and transactional data via Azure Databricks. Implemented Delta Lake (Bronze→Silver→Gold), Spark SQL optimisations, and ADF orchestration for continuous ML-ready data refresh.
PySpark · Azure Databricks · Delta Lake · ADF · MongoDB · Spark SQL
50% cost reduction · 20% faster queries · 70% fewer errors
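A minimal sketch of the Bronze → Silver → Gold layering used in this project, shown over in-memory records instead of Delta tables so it runs anywhere; the field names and aggregation are illustrative assumptions, not the actual platform schema:

```python
# Bronze → Silver → Gold medallion pattern, sketched with plain Python
# records. In the real platform each layer is a Delta table in Databricks.

bronze = [  # raw ingested events: duplicates and bad records included
    {"user": "a", "score": 80},
    {"user": "a", "score": 80},   # duplicate
    {"user": "b", "score": None}, # invalid record
    {"user": "c", "score": 95},
]

# Silver: validated and de-duplicated
seen, silver = set(), []
for rec in bronze:
    key = (rec["user"], rec["score"])
    if rec["score"] is not None and key not in seen:
        seen.add(key)
        silver.append(rec)

# Gold: aggregated, analytics-ready view (average score)
gold = {"avg_score": sum(r["score"] for r in silver) / len(silver)}
print(gold)  # → {'avg_score': 87.5}
```

The point of the layering is that raw data lands untouched in Bronze, so Silver and Gold can always be rebuilt when validation rules or aggregations change.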
🔍
CMA — Intelligent Image Data Extraction
High-performance Python pipeline extracting structured data from images inside PDFs using OCR and computer vision. Multi-threaded architecture for parallel processing with PostgreSQL output.
Python · Tesseract OCR · OpenCV · Multi-threading · PostgreSQL · PyPDF2
75% boost in processing throughput
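A skeleton of the multi-threaded extraction pattern behind this project; the OCR step is stubbed so the threading structure itself is runnable (the real pipeline would call pytesseract/OpenCV per page image, and the file names here are invented):

```python
# Multi-threaded page processing skeleton. The real pipeline runs OCR per
# page image; here extract_text is a stub so the pattern is self-contained.
from concurrent.futures import ThreadPoolExecutor

def extract_text(page_image):
    # Real version would be roughly:
    #   return pytesseract.image_to_string(preprocess(page_image))
    return f"text-from-{page_image}"  # stub for illustration

pages = [f"page-{i}.png" for i in range(8)]

# OCR work is I/O-heavy and runs in C extensions that release the GIL,
# so a thread pool gives genuine parallel throughput.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(extract_text, pages))

print(len(results))  # → 8
```

`pool.map` preserves input order, so extracted text can be written back to PostgreSQL against the right page without extra bookkeeping.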
🏗️
ETL Pipeline Automation — ADF & Databricks
Production ETL pipelines using Azure Data Factory for ingestion from SQL and Azure Blob, with complex PySpark transformations stored in Delta Lake for ACID-compliant, versioned analytics.
Azure Data Factory · Databricks · PySpark · Delta Lake · Azure Blob
80% reduction in manual intervention
📈
SQL Optimisation & Reporting Automation
Optimised large-scale SQL queries through index tuning, query refactoring, and execution plan analysis. Automated end-to-end reporting via Python scripts and stored procedures.
SQL · Python · Stored Procedures · Index Tuning
50% improvement in report performance
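A tiny, self-contained illustration of execution-plan-driven index tuning, using SQLite's `EXPLAIN QUERY PLAN` so it runs with the standard library; the project targeted a different engine, and the table and column names here are illustrative:

```python
# Index tuning via execution plan inspection, demonstrated with SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [(i, f"region-{i % 5}", i * 1.5) for i in range(1000)],
)

def plan(sql):
    """Return the query plan details as one string."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT SUM(total) FROM reports WHERE region = 'region-3'"
print(plan(query))  # before indexing: a full table scan

conn.execute("CREATE INDEX idx_reports_region ON reports (region)")
print(plan(query))  # after: SEARCH ... USING INDEX idx_reports_region
```

Reading the plan before and after the change is what turns index tuning from guesswork into a verifiable optimisation.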
🧾
Dhenusya Organics — Billing Automation
Python billing application automating invoice generation, inventory management, and reminders. PostgreSQL backend with WhatsApp invoice delivery via Pywhatkit.
Python · PostgreSQL · Pywhatkit · Pandas
60% reduction in manual billing effort
Key Achievements

Impact by the numbers.

Quantified outcomes from production data engineering work across platforms and industries.

🏆
Reduced costs by 50% and improved system performance by 20% through optimised data modeling and SQL tuning
Migrated legacy PostgreSQL pipelines to Azure Databricks, achieving 3× performance gains and horizontal scalability
🛡️
Improved data quality via robust validation frameworks, achieving a 70% reduction in transformation errors
📄
Built OCR-driven extraction using Python + Tesseract + OpenCV, increasing throughput by 75%
🤖
Automated invoice management for Dhenusya Organics, reducing manual effort by 60%
☁️
Ensured high data availability via proactive monitoring and Azure Data Factory orchestration
🔧
Tuned Spark configs achieving 30% faster SQL queries and 50% lower data load latency
📐
Delivered end-to-end data governance and lineage using Unity Catalog and Microsoft Purview
Get In Touch

Let's build
something great.

Whether it's a data architecture challenge, a consulting engagement, or just a conversation about Lakehouse design — I'm always happy to connect.

✉️
Email
telugusuresh.1998@gmail.com
📱
Phone
+91-9652358728
💼
LinkedIn
Telugu Doddi Suresh
🐙
GitHub
github.com/telugusuresh
📄
Resume
Download PDF Resume