Ashwen Kumar

Lead Data Engineer
Kuala Lumpur

Summary

Experienced data engineer with over 10 years of expertise in developing solutions for data lake and data warehousing projects. Skilled in using the big data stack and the Azure cloud platform to build efficient data pipelines within enterprise data management platforms. Open to relocation to engage with and learn from industry experts worldwide.

Overview

11 years of professional experience
4 years of post-secondary education
2 certifications
1 language

Work History

Technical Manager - ET Data Platforms

Standard Chartered Global Business Services
12.2024 - Current

Leading a Level 3 team of 8 members across Malaysia and India, conducting daily scrum calls to review ongoing tasks and sprint progress.

  • Developed shell scripts to validate incoming files before loading them into the Data Lake.
  • Managed file decryption using asymmetric keys for files encrypted with the CAAS encryption framework.
  • Resolved schema inconsistencies in HDFS ORC files after the framework migration by quickly developing ad hoc Spark jobs.
  • Facilitated the migration of the initial data layers from on-premises to the Azure cloud platform and completed phase 1 of deploying jobs on Azure using Databricks, Azure Data Factory, and MinIO. Leveraged Delta tables for data storage and schema evolution (see the sketch after this role's tool list).
  • Led and completed 20+ deployments efficiently and supported applications post-deployment.
  • Project Name: Enterprise Data Management Platform
  • Tools Used: Spark 2.3.0, Scala, Hive, Unix, Control-M, Azure Databricks, Kafka, Airflow, Data Lake Gen2, MinIO, Azure DevOps
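
A minimal sketch of the ORC-to-Delta flow described above, written in PySpark. The paths, table name, and options are illustrative assumptions, not the platform's actual configuration.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orc_to_delta_sketch").getOrCreate()

    # Read post-migration ORC files, merging per-file schemas that drifted
    # during the framework migration (hypothetical HDFS path).
    df = spark.read.option("mergeSchema", "true").orc("hdfs:///data/landing/orders/")

    # Append into a Delta table; Delta's mergeSchema write option evolves
    # the table schema as new columns arrive. Table name is hypothetical.
    (df.write.format("delta")
       .mode("append")
       .option("mergeSchema", "true")
       .saveAsTable("edmp.orders_bronze"))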

Senior Hadoop Developer

Accord Innovations SDN BHD
07.2019 - 08.2021
  • Spearheaded the project as the inaugural team member, receiving transition and business requirements from the onsite Singapore Team. Played a pivotal role in recruiting and assembling a proficient team for managing monthly sprints.
  • Managed the entire lifecycle of data pipelines, orchestrating data from diverse source systems to the big data platform using Sqoop. Implemented SCD2 transformations in PySpark to load the final layer of the Hadoop platform (see the sketch after this role's tool list).
  • Collaborated closely with the framework team to address various issues encountered during data ingestion. Developed scripts for data loading and utilized Python scripts to populate configurations into MySQL tables.
  • Constructed Hive views by leveraging join operations on different tables, aligning with business logic to grant business users access to data in the exploration zone.
  • Significantly optimized production job runtimes for loading high volumes of transactional data by fine-tuning Spark memory parameters and mitigating data skew.
  • Project Name: MYSI (Malaysia Source Ingestion Platform)
  • Tools Used: Spark 2.3.0, Python, MySQL, Hive, Unix, Autosys
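
A minimal sketch of the SCD2 pattern named above, assuming a hypothetical customer dimension where open rows carry end_dt = 9999-12-31 and a single tracked attribute; table and column names are illustrative, not the MYSI schema.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()
    HIGH_DATE = "9999-12-31"

    dim = spark.table("final.customer_dim")   # hypothetical target dimension
    stg = spark.table("staging.customer")     # hypothetical daily snapshot
    open_dim = dim.filter(F.col("end_dt") == HIGH_DATE)

    # Close the current version of every key whose tracked attribute changed.
    closed = (open_dim.alias("d")
              .join(stg.alias("s"), "cust_id")
              .filter(F.col("d.segment") != F.col("s.segment"))
              .select("d.*")
              .withColumn("end_dt", F.date_sub(F.current_date(), 1)))

    # Open a new version for changed keys and insert brand-new keys.
    incoming = (stg.alias("s")
                .join(open_dim.select("cust_id", "segment").alias("d"),
                      "cust_id", "left")
                .filter(F.col("d.segment").isNull() |
                        (F.col("d.segment") != F.col("s.segment")))
                .select("s.*")
                .withColumn("start_dt", F.current_date())
                .withColumn("end_dt", F.lit(HIGH_DATE)))

    # Rows to write back; assumes dim and staging share business columns.
    delta_rows = closed.unionByName(incoming)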

Senior Analyst

Standard Chartered Global Services
12.2017 - 06.2019
  • Implemented Hive tables mirroring the structure of source tables across all layers of the Hadoop environment, ensuring data consistency and accessibility.
  • Developed Unix shell scripts to preprocess data stored in the NAS path before its ingestion into the Hadoop landing zone, enhancing data quality and efficiency.
  • Engineered scripts to manage data retention based on defined policies, ensuring compliance and efficient storage utilization within the Hadoop ecosystem (see the sketch after this role's tool list).
  • Designed Control-M scripts to orchestrate job scheduling in production environments, ensuring seamless execution. Automated batch monitoring by creating reports that save time and reduce manual work.
  • Project Name: Enterprise Data Management Platform
  • Tools Used: Hadoop MapReduce, Scala, Hive, Unix, Control-M, Jenkins
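
A minimal sketch of the retention idea above, assuming date-partitioned HDFS paths (dt=YYYY-MM-DD) and a 90-day window; the base path and policy are illustrative assumptions.

    import subprocess
    from datetime import datetime, timedelta

    RETENTION_DAYS = 90                 # assumed policy
    BASE = "/data/landing/trades"       # hypothetical HDFS path
    cutoff = datetime.today() - timedelta(days=RETENTION_DAYS)

    # List partition directories under the base path.
    out = subprocess.run(["hdfs", "dfs", "-ls", BASE],
                         capture_output=True, text=True, check=True).stdout

    for line in out.splitlines():
        parts = line.split()
        if not parts or "dt=" not in parts[-1]:
            continue                    # skips the "Found N items" header line
        path = parts[-1]
        part_date = datetime.strptime(path.rsplit("dt=", 1)[1], "%Y-%m-%d")
        if part_date < cutoff:
            # Permanently drop partitions older than the retention window.
            subprocess.run(["hdfs", "dfs", "-rm", "-r", "-skipTrash", path],
                           check=True)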

Software Engineer

Ericsson India Pvt Ltd.
10.2016 - 11.2017
  • The Charge Reporting System (CRS) project loads processed CDRs (Call Detail Records) into HBase tables. Input CDRs originate from various telecom nodes and arrive encoded in ASN.1 BER format; they are decoded by an internal tool known as multi-mediation and converted into CSV files for further processing.
  • Designed the HBase data model for efficient storage and retrieval, including appropriate row keys for newly deployed tables to optimize data access patterns (see the sketch after this role's tool list).
  • Employed the Flow Engine, a robust ETL tool akin to NiFi, for streamlined data loading and transformation, improving efficiency and scalability.
  • Project Name: Charge Reporting System (CRS)
  • Tools Used: HBase, Unix, Flow Engine, Git, Jenkins
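
A minimal sketch of the row key design mentioned above: a salt bucket spreads writes across HBase regions, and a reversed timestamp keeps the newest CDRs first in a scan. Field names and bucket count are illustrative assumptions, not the CRS design.

    import hashlib

    def cdr_row_key(msisdn: str, event_ts_ms: int, buckets: int = 16) -> bytes:
        # Salt prefix derived from the subscriber number spreads hot writes
        # across regions instead of hammering one region with
        # monotonically increasing keys.
        salt = int(hashlib.md5(msisdn.encode()).hexdigest(), 16) % buckets
        # Reversed epoch-millis timestamp sorts newest-first per subscriber.
        reversed_ts = (2 ** 63 - 1) - event_ts_ms
        return f"{salt:02d}|{msisdn}|{reversed_ts:019d}".encode()

    key = cdr_row_key("60123456789", 1700000000000)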

Associate Software Professional

Computer Sciences Corporation
07.2014 - 10.2016
  • The Pharmacy Information System (PIMS) project focuses on migrating the existing data warehouse from TANDEM-SQL to Hadoop. A new data warehouse is designed in Hive, catering to the requirements of the ONLINE FOREX team for generating diverse reports related to drug consumption.
  • Developed Sqoop scripts for seamless data import from the Tandem-SQL database to the Big Data cluster, ensuring data integrity and consistency throughout the migration process.
  • Created scripts for scheduling jobs to run on a regular basis, leveraging tools such as Oozie for job orchestration and automation, ensuring timely data processing and availability.
  • Conducted performance testing to enhance data import times, optimizing data processing efficiency and throughput. Implemented optimizations based on performance test results to ensure optimal system performance.
  • Analyzed source system tables to identify efficient split columns for achieving parallelism and determining the optimal number of mappers, enhancing data processing speed and resource utilization (see the sketch after this role's tool list).
  • Project Name: Pharmacy Information System (PIMS)
  • Tools Used: Hadoop, Hive, Unix, Tandem-SQL, Oozie
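
A minimal sketch of the parallel import pattern above, invoking Sqoop from Python. The connection URL, table, split column, and mapper count are illustrative assumptions, not the PIMS values.

    import subprocess

    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:t4sqlmx://tandem-host:8000",  # hypothetical Tandem JDBC URL
        "--table", "DRUG_DISPENSE",                      # hypothetical source table
        "--split-by", "DISPENSE_ID",   # evenly distributed key -> balanced splits
        "--num-mappers", "8",          # one map task per split
        "--target-dir", "/data/raw/pims/drug_dispense",
    ]
    subprocess.run(cmd, check=True)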

Education

Bachelor of Technology - Information Technology (B.Tech - IT)

Anna University
India
01.2010 - 01.2014

Skills

  • Hadoop
  • Hive
  • Sqoop
  • Spark
  • Kafka
  • HBase
  • Azure Cosmos DB
  • Azure Data Factory
  • Apache NiFi
  • dbt
  • Docker
  • Scala
  • Python
  • Azure Databricks
  • Data Lake
  • Azure SQL
  • Unix
  • Git
  • Dimension
  • Azure DevOps
  • Jenkins
  • Ansible

Certification

Azure Fundamentals (AZ900)

Websites, Portfolios and Profiles

www.linkedin.com/in/ashwen-kumar-3600a937
