Results-driven Data Engineer with over 9 years of hands-on experience in data pipeline development, big data architecture, cloud infrastructure, and enterprise software solutions. Proven expertise in leading ETL processes, feature engineering, and model deployment on cloud platforms including GCP, Azure, and Alibaba Cloud.
• 6+ years of experience in Cloud & Big Data infrastructure, designing and deploying scalable data pipelines with tools such as Azure Databricks, GCP Cloud Functions, and Kafka, with strong exposure to the Hadoop ecosystem and Cloudera platforms
• Specialized in ETL development and orchestration using PySpark, with advanced skills in integrating structured/unstructured data from diverse sources for real-time and batch analytics
• Delivered impactful projects across industries including Agritech, Telecom, Banking, Energy, and Retail, working closely with Data Science teams to productionize ML models
• 5+ years of experience in DWH & Data Integration, including real-time Oracle CDC via Informatica PowerExchange, and downstream data management within large-scale banking systems
• 7+ years of software development experience using Oracle PL/SQL, SQL Server, .NET, and Crystal Reports, focusing on high-performance database systems, particularly in insurance and ERP environments
• Strong understanding of data governance, monitoring, alerting, and cloud-native architecture, with a consistent record of delivering high-impact solutions in complex, data-intensive environments
Agricultural AI Projects – Google Cloud Platform (GCP)
• Partnered with the Data Science team to deploy machine learning models into real-time data pipelines with full-lifecycle feature engineering
• Swine Cough Detection System:
– Ingested .wav audio files into Cloud Storage (GCS)
– Triggered Cloud Functions to extract acoustic features
– Deployed a model to classify whether the input sound was a pig’s cough
• Chicken Mortality Forecast:
– Collected data via Pub/Sub, stored it in GCS, and triggered Cloud Functions on a timer
– Performed tailored feature engineering to feed the predictive model
– Output: identified which farms/coops were at high risk of chicken mortality within 7 days
• Shrimp Feeding Alert System:
– Integrated RTSP cameras to capture real-time screenshots
– Used an image recognition model to detect floating shrimp
– Triggered alerts to farm endpoints when overfeeding was detected
Energy & IoT Projects – Alibaba Cloud
• Built ETL pipelines for national energy monitoring and smart alert systems
• Utility Billing Analytics Platform:
– Set up API Gateway for external partners to send data
– Ingested raw data into Log Service and transformed it
– Stored cleaned data in AnalyticDB
– Managed orchestration workflows for consistent ETL and frontend dashboard readiness
• Electrical Appliance Failure Alert System:
– Used Function Compute with a timer trigger to sweep through ingested logs
– Applied business logic to detect failure patterns
– Automatically sent alerts via LINE API for rapid maintenance response
Retail Data Platform – Microsoft Azure
• Delivered a full-stack data ingestion and reporting system for retail operations
• Set up Azure API Management to manage incoming backend data
• Ingested transactional data via Azure Function App into Azure SQL Database
• Developed ETL pipelines triggered by timers to prepare data per business requirements
• Enabled self-service analytics by integrating Power BI with curated datasets
Core Skills
ETL development
Data pipeline design
Data warehousing
Data migration
SQL expertise
Big data processing
Performance tuning
API development
Spark framework
Data quality assurance
Real-time analytics
Hadoop ecosystem