Open to opportunities · Hyderabad, India

Telugu Doddi Suresh

Senior Data Engineer · Azure · PySpark · Databricks

Results-driven Data Engineer with 4+ years of experience building cloud-native data platforms on Azure. Specialist in Lakehouse architecture, Delta Lake pipelines, and PySpark transformations that power business intelligence at scale.

Azure Lakehouse Architecture
MongoDB · PostgreSQL · Azure Blob
↓ ingest via Azure Data Factory
Azure Databricks · PySpark
↓ transform, validate & enrich
Delta Lake — 🥉 Bronze → 🥈 Silver → 🥇 Gold
↓ serve to consumers
Power BI · ML Models · Analytics
50% Cost Reduction · 70% Fewer Errors · 4+ Years Experience
About Me

Azure-native.
Data-obsessed.

I'm a Senior Data Engineer specialising in cloud-native solutions built on Azure. My work spans end-to-end pipeline architecture, from ingesting raw data from MongoDB, PostgreSQL, and Azure Blob Storage to the Gold-layer Delta Lake models that power executive dashboards.

I care deeply about data quality, governance, and reliability. Using Unity Catalog and Microsoft Purview, I build systems where stakeholders can trust every number they see.

Currently at Deloitte Consulting (USI), after impactful senior roles at LTIMindTree and HashStack Solutions — delivering measurable improvements at each step.

Core Focus Areas
Lakehouse Architecture · ETL / ELT · Delta Lake · Data Governance · Performance Optimization · Cloud-Native Azure · PySpark · Databricks
Education
B.Tech — Computer Science & Engineering
Malla Reddy Institute of Technology · 2017 – 2021
Working Style
Agile / Scrum · Dimensional Modeling · Data Lineage · CI/CD
Tech Stack

Built with the right tools.

Technologies I use daily to architect, build, and operate production-grade Azure data platforms.

Processing
Distributed Computing
PySpark · Apache Spark · Spark SQL · Azure Databricks
🔄
Orchestration
Pipeline Orchestration
Azure Data Factory · Azure Pipelines · Workflow Triggers
🏛️
Architecture
Data Architecture
Delta Lake · Lakehouse · Bronze/Silver/Gold · ETL/ELT
🗄️
Databases
Data Storage
PostgreSQL · MongoDB · Azure Blob Storage · NoSQL
🛡️
Governance
Data Governance
Unity Catalog · Microsoft Purview · Data Lineage · Data Quality
🐍
Languages
Programming
Python · SQL · Pandas · NumPy
📊
Visualization
Reporting & BI
Power BI · Dashboards · EDA
🔬
Specialised
Advanced Skills
OCR / OpenCV · Tesseract · Feature Engineering · Dimensional Modeling
Experience

Where I've built things.

A track record of delivering scalable data infrastructure across product, consulting, and enterprise environments.

March 2026 – Present
Consultant — Python Data Engineer
Deloitte Consulting India Pvt. Ltd. (USI) · Hyderabad
  • Designing and delivering cloud-native data solutions for enterprise clients in a consulting capacity
  • Applying Python, PySpark, and the full Azure stack to solve complex data engineering challenges at scale
September 2025 – February 2026
Senior Data Engineer
LTIMindTree Limited · Hyderabad
  • Delivered senior-level data engineering across complex enterprise data platform initiatives
  • Drove pipeline performance improvements and cross-team architecture reviews
October 2021 – August 2025
Data Engineer
HashStack Solutions Pvt. Ltd. (formerly Zitisi Solutions LLP) · Hyderabad
  • Designed and optimized scalable ETL/ELT pipelines using PySpark and Spark SQL in Azure Databricks, processing high-volume behavioural and transactional data from MongoDB
  • Implemented Delta Lake architecture (Bronze → Silver → Gold) with schema evolution and time travel for reliable batch and incremental data loads
  • Applied Spark SQL optimizations — broadcast joins, predicate pushdown, caching — reducing compute cost and improving pipeline execution by 50%
  • Designed scalable data models and partitioning strategies in Databricks, delivering 20% better query performance and 50% reduction in operating costs
  • Developed custom Spark-based data quality checks, detecting anomalies early and reducing transformation errors by over 70%
  • Migrated legacy pipelines from PostgreSQL to Apache Spark in Databricks, achieving significant scalability and throughput gains
  • Built Python automation using Tesseract OCR and OpenCV for intelligent image extraction from PDFs, boosting throughput by 75% via multi-threading
  • Developed a Python billing automation app with PostgreSQL backend and WhatsApp invoice delivery, cutting manual effort by 60%
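A simplified sketch of the rule-based data quality checks mentioned above, shown in plain Python over a list of records rather than a Spark DataFrame so the pattern itself is runnable; column names (`user_id`, `amount`) and the rules are illustrative assumptions, not the production implementation:

```python
# Illustrative rule-based data quality checks, similar in spirit to the
# Spark-based validations described above. In production these rules would
# run as DataFrame filters; here they operate on plain dicts.

def check_record(record):
    """Return a list of rule violations for one record (empty = clean)."""
    errors = []
    if record.get("user_id") is None:          # completeness check
        errors.append("missing user_id")
    amount = record.get("amount")
    if amount is None or amount < 0:           # validity / range check
        errors.append("invalid amount")
    return errors

def quarantine(records):
    """Split records into clean rows and quarantined rows with reasons."""
    clean, bad = [], []
    for rec in records:
        errors = check_record(rec)
        if errors:
            bad.append((rec, errors))   # quarantined with its reasons
        else:
            clean.append(rec)
    return clean, bad

rows = [
    {"user_id": 1, "amount": 42.0},
    {"user_id": None, "amount": 10.0},   # fails completeness
    {"user_id": 2, "amount": -5.0},      # fails range check
]
clean, bad = quarantine(rows)
print(len(clean), len(bad))  # → 1 2
```

Quarantining bad rows with explicit reasons, rather than silently dropping them, is what lets anomalies surface early instead of propagating into downstream transformations.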
Projects

Work that ships and scales.

Production data engineering projects spanning Lakehouse architecture, OCR automation, and pipeline optimisation.

🧠
MayaMaya — AI-Powered Career Platform
End-to-end Lakehouse platform processing psychometric, behavioural, and transactional data via Azure Databricks. Implemented Delta Lake (Bronze→Silver→Gold), Spark SQL optimisations, and ADF orchestration for continuous ML-ready data refresh.
PySpark · Azure Databricks · Delta Lake · ADF · MongoDB · Spark SQL
50% cost reduction · 20% faster queries · 70% fewer errors
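A minimal sketch of the Bronze → Silver → Gold layering used in this project, shown over in-memory records instead of Delta tables so it runs anywhere; the field names and aggregation are illustrative assumptions, not the actual platform schema:

```python
# Bronze → Silver → Gold medallion pattern, sketched with plain Python
# records. In the real platform each layer is a Delta table in Databricks.

bronze = [  # raw ingested events: duplicates and bad records included
    {"user": "a", "score": 80},
    {"user": "a", "score": 80},   # duplicate
    {"user": "b", "score": None}, # invalid record
    {"user": "c", "score": 95},
]

# Silver: validated and de-duplicated
seen, silver = set(), []
for rec in bronze:
    key = (rec["user"], rec["score"])
    if rec["score"] is not None and key not in seen:
        seen.add(key)
        silver.append(rec)

# Gold: aggregated, analytics-ready view (average score)
gold = {"avg_score": sum(r["score"] for r in silver) / len(silver)}
print(gold)  # → {'avg_score': 87.5}
```

The point of the layering is that raw data lands untouched in Bronze, so Silver and Gold can always be rebuilt when validation rules or aggregations change.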
🔍
CMA — Intelligent Image Data Extraction
High-performance Python pipeline extracting structured data from images inside PDFs using OCR and computer vision. Multi-threaded architecture for parallel processing with PostgreSQL output.
Python · Tesseract OCR · OpenCV · Multi-threading · PostgreSQL · PyPDF2
75% boost in processing throughput
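A skeleton of the multi-threaded extraction pattern behind this project; the OCR step is stubbed so the threading structure itself is runnable (the real pipeline would call pytesseract/OpenCV per page image, and the file names here are invented):

```python
# Multi-threaded page processing skeleton. The real pipeline runs OCR per
# page image; here extract_text is a stub so the pattern is self-contained.
from concurrent.futures import ThreadPoolExecutor

def extract_text(page_image):
    # Real version would be roughly:
    #   return pytesseract.image_to_string(preprocess(page_image))
    return f"text-from-{page_image}"  # stub for illustration

pages = [f"page-{i}.png" for i in range(8)]

# OCR work is I/O-heavy and runs in C extensions that release the GIL,
# so a thread pool gives genuine parallel throughput.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(extract_text, pages))

print(len(results))  # → 8
```

`pool.map` preserves input order, so extracted text can be written back to PostgreSQL against the right page without extra bookkeeping.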
🏗️
ETL Pipeline Automation — ADF & Databricks
Production ETL pipelines using Azure Data Factory for ingestion from SQL and Azure Blob, with complex PySpark transformations stored in Delta Lake for ACID-compliant, versioned analytics.
Azure Data Factory · Databricks · PySpark · Delta Lake · Azure Blob
80% reduction in manual intervention
📈
SQL Optimisation & Reporting Automation
Optimised large-scale SQL queries through index tuning, query refactoring, and execution plan analysis. Automated end-to-end reporting via Python scripts and stored procedures.
SQL · Python · Stored Procedures · Index Tuning
50% improvement in report performance
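A tiny, self-contained illustration of execution-plan-driven index tuning, using SQLite's `EXPLAIN QUERY PLAN` so it runs with the standard library; the project targeted a different engine, and the table and column names here are illustrative:

```python
# Index tuning via execution plan inspection, demonstrated with SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [(i, f"region-{i % 5}", i * 1.5) for i in range(1000)],
)

def plan(sql):
    """Return the query plan details as one string."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT SUM(total) FROM reports WHERE region = 'region-3'"
print(plan(query))  # before indexing: a full table scan

conn.execute("CREATE INDEX idx_reports_region ON reports (region)")
print(plan(query))  # after: SEARCH ... USING INDEX idx_reports_region
```

Reading the plan before and after the change is what turns index tuning from guesswork into a verifiable optimisation.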
🧾
Dhenusya Organics — Billing Automation
Python billing application automating invoice generation, inventory management, and reminders. PostgreSQL backend with WhatsApp invoice delivery via Pywhatkit.
Python · PostgreSQL · Pywhatkit · Pandas
60% reduction in manual billing effort
Key Achievements

Impact by the numbers.

Quantified outcomes from production data engineering work across platforms and industries.

🏆
Reduced costs by 50% and improved system performance by 20% through optimised data modeling and SQL tuning
Migrated legacy PostgreSQL pipelines to Azure Databricks, achieving 3× performance gains and horizontal scalability
🛡️
Improved data quality via robust validation frameworks, achieving a 70% reduction in transformation errors
📄
Built OCR-driven extraction using Python + Tesseract + OpenCV, increasing throughput by 75%
🤖
Automated invoice management for Dhenusya Organics, reducing manual effort by 60%
☁️
Ensured high data availability via proactive monitoring and Azure Data Factory orchestration
🔧
Tuned Spark configs achieving 30% faster SQL queries and 50% lower data load latency
📐
Delivered end-to-end data governance and lineage using Unity Catalog and Microsoft Purview
Get In Touch

Let's build
something great.

Whether it's a data architecture challenge, a consulting engagement, or just a conversation about Lakehouse design — I'm always happy to connect.

✉️
Email
telugusuresh.1998@gmail.com
📱
Phone
+91-9652358728
💼
LinkedIn
Telugu Doddi Suresh
🐙
GitHub
github.com/telugusuresh
📄
Resume
Download PDF Resume