Ashwen Kumar

Lead Data Engineer
Kuala Lumpur

Summary

Experienced data engineer with over 10 years of expertise in developing solutions for data lake and data warehousing projects. Skilled in using the big data stack and the Azure cloud platform to build efficient data pipelines within enterprise data management platforms. Open to relocation to engage with and learn from industry experts worldwide.

Overview

11 years of professional experience
4 years of post-secondary education
2 certifications
1 language

Work History

Technical Manager - ET Data Platforms

Standard Chartered Global Business Services
12.2024 - Current

Leading a Level 3 team of 8 members across Malaysia and India, conducting daily scrum calls to review ongoing tasks and sprint progress.

  • Developed shell scripts to validate incoming files before loading them into the Data Lake.
  • Managed file decryption using asymmetric keys for files encrypted with the CAAS encryption framework.
  • Resolved schema inconsistencies in HDFS ORC files after the framework migration by quickly developing ad hoc Spark jobs.
  • Facilitated the migration of the initial data layers from on-premises to the Azure cloud platform and completed phase 1 of deploying jobs on Azure using Databricks, Azure Data Factory, and MinIO. Leveraged Delta tables for data storage and schema evolution (see the sketch after this role's tool list).
  • Led and completed 20+ deployments efficiently and supported applications post-deployment.
  • Project Name: Enterprise Data Management Platform
  • Tools Used: Spark 2.3.0, Scala, Hive, Unix, Control-M, Azure Databricks, Kafka, Airflow, Data Lake Gen2, MinIO, Azure DevOps
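
A minimal sketch of the ORC-to-Delta flow described above, written in PySpark. The paths, table name, and options are illustrative assumptions, not the platform's actual configuration.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orc_to_delta_sketch").getOrCreate()

    # Read post-migration ORC files, merging per-file schemas that drifted
    # during the framework migration (hypothetical HDFS path).
    df = spark.read.option("mergeSchema", "true").orc("hdfs:///data/landing/orders/")

    # Append into a Delta table; Delta's mergeSchema write option evolves
    # the table schema as new columns arrive. Table name is hypothetical.
    (df.write.format("delta")
       .mode("append")
       .option("mergeSchema", "true")
       .saveAsTable("edmp.orders_bronze"))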

Senior Hadoop Developer

Accord Innovations SDN BHD
07.2019 - 08.2021
  • Spearheaded the project as the inaugural team member, receiving transition and business requirements from the onsite Singapore Team. Played a pivotal role in recruiting and assembling a proficient team for managing monthly sprints.
  • Managed the entire lifecycle of data pipelines, orchestrating data from diverse source systems to the big data platform using Sqoop. Implemented SCD2 transformations in PySpark to load the final layer of the Hadoop platform (see the sketch after this role's tool list).
  • Collaborated closely with the framework team to address various issues encountered during data ingestion. Developed scripts for data loading and utilized Python scripts to populate configurations into MySQL tables.
  • Constructed Hive views by leveraging join operations on different tables, aligning with business logic to grant business users access to data in the exploration zone.
  • Significantly optimized production job runtimes for loading high volumes of transactional data by fine-tuning Spark memory parameters and mitigating data skew.
  • Project Name: MYSI (Malaysia Source Ingestion Platform)
  • Tools Used: Spark 2.3.0, Python, MySQL, Hive, Unix, Autosys
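
A minimal sketch of the SCD2 pattern named above, assuming a hypothetical customer dimension where open rows carry end_dt = 9999-12-31 and a single tracked attribute; table and column names are illustrative, not the MYSI schema.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()
    HIGH_DATE = "9999-12-31"

    dim = spark.table("final.customer_dim")   # hypothetical target dimension
    stg = spark.table("staging.customer")     # hypothetical daily snapshot
    open_dim = dim.filter(F.col("end_dt") == HIGH_DATE)

    # Close the current version of every key whose tracked attribute changed.
    closed = (open_dim.alias("d")
              .join(stg.alias("s"), "cust_id")
              .filter(F.col("d.segment") != F.col("s.segment"))
              .select("d.*")
              .withColumn("end_dt", F.date_sub(F.current_date(), 1)))

    # Open a new version for changed keys and insert brand-new keys.
    incoming = (stg.alias("s")
                .join(open_dim.select("cust_id", "segment").alias("d"),
                      "cust_id", "left")
                .filter(F.col("d.segment").isNull() |
                        (F.col("d.segment") != F.col("s.segment")))
                .select("s.*")
                .withColumn("start_dt", F.current_date())
                .withColumn("end_dt", F.lit(HIGH_DATE)))

    # Rows to write back; assumes dim and staging share business columns.
    delta_rows = closed.unionByName(incoming)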

Senior Analyst

Standard Chartered Global Services
12.2017 - 06.2019
  • Implemented Hive tables mirroring the structure of source tables across all layers of the Hadoop environment, ensuring data consistency and accessibility.
  • Developed Unix shell scripts to preprocess data stored in the NAS path before its ingestion into the Hadoop landing zone, enhancing data quality and efficiency.
  • Engineered scripts to manage data retention based on defined policies, ensuring compliance and efficient storage utilization within the Hadoop ecosystem (see the sketch after this role's tool list).
  • Designed Control-M scripts to orchestrate job scheduling in production environments, ensuring seamless execution. Automated batch monitoring by creating reports that save time and reduce manual work.
  • Project Name: Enterprise Data Management Platform
  • Tools Used: Hadoop MapReduce, Scala, Hive, Unix, Control-M, Jenkins
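
A minimal sketch of the retention idea above, assuming date-partitioned HDFS paths (dt=YYYY-MM-DD) and a 90-day window; the base path and policy are illustrative assumptions.

    import subprocess
    from datetime import datetime, timedelta

    RETENTION_DAYS = 90                 # assumed policy
    BASE = "/data/landing/trades"       # hypothetical HDFS path
    cutoff = datetime.today() - timedelta(days=RETENTION_DAYS)

    # List partition directories under the base path.
    out = subprocess.run(["hdfs", "dfs", "-ls", BASE],
                         capture_output=True, text=True, check=True).stdout

    for line in out.splitlines():
        parts = line.split()
        if not parts or "dt=" not in parts[-1]:
            continue                    # skips the "Found N items" header line
        path = parts[-1]
        part_date = datetime.strptime(path.rsplit("dt=", 1)[1], "%Y-%m-%d")
        if part_date < cutoff:
            # Permanently drop partitions older than the retention window.
            subprocess.run(["hdfs", "dfs", "-rm", "-r", "-skipTrash", path],
                           check=True)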

Software Engineer

Ericsson India Pvt Ltd.
10.2016 - 11.2017
  • The Charge Reporting System (CRS) project loads processed CDRs (Call Detail Records) into HBase tables. Input CDRs originate from various telecom nodes and arrive encoded in ASN.1 BER format; they are decoded by an internal tool known as multi-mediation and converted into CSV files for further processing.
  • Designed the HBase data model for efficient storage and retrieval, including appropriate row keys for newly deployed tables to optimize data access patterns (see the sketch after this role's tool list).
  • Employed the Flow Engine, a robust ETL tool akin to NiFi, for streamlined data loading and transformation, improving efficiency and scalability.
  • Project Name: Charge Reporting System (CRS)
  • Tools Used: HBase, Unix, Flow Engine, Git, Jenkins
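
A minimal sketch of the row key design mentioned above: a salt bucket spreads writes across HBase regions, and a reversed timestamp keeps the newest CDRs first in a scan. Field names and bucket count are illustrative assumptions, not the CRS design.

    import hashlib

    def cdr_row_key(msisdn: str, event_ts_ms: int, buckets: int = 16) -> bytes:
        # Salt prefix derived from the subscriber number spreads hot writes
        # across regions instead of hammering one region with
        # monotonically increasing keys.
        salt = int(hashlib.md5(msisdn.encode()).hexdigest(), 16) % buckets
        # Reversed epoch-millis timestamp sorts newest-first per subscriber.
        reversed_ts = (2 ** 63 - 1) - event_ts_ms
        return f"{salt:02d}|{msisdn}|{reversed_ts:019d}".encode()

    key = cdr_row_key("60123456789", 1700000000000)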

Associate Software Professional

Computer Sciences Corporation
07.2014 - 10.2016
  • The Pharmacy Information System (PIMS) project focuses on migrating the existing data warehouse from TANDEM-SQL to Hadoop. A new data warehouse is designed in Hive, catering to the requirements of the ONLINE FOREX team for generating diverse reports related to drug consumption.
  • Developed Sqoop scripts for seamless data import from the Tandem-SQL database to the Big Data cluster, ensuring data integrity and consistency throughout the migration process.
  • Created scripts for scheduling jobs to run on a regular basis, leveraging tools such as Oozie for job orchestration and automation, ensuring timely data processing and availability.
  • Conducted performance testing to enhance data import times, optimizing data processing efficiency and throughput. Implemented optimizations based on performance test results to ensure optimal system performance.
  • Analyzed source system tables to identify efficient split columns for achieving parallelism and determining the optimal number of mappers, enhancing data processing speed and resource utilization (see the sketch after this role's tool list).
  • Project Name: Pharmacy Information System (PIMS)
  • Tools Used: Hadoop, Hive, Unix, Tandem-SQL, Oozie
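
A minimal sketch of the parallel import pattern above, invoking Sqoop from Python. The connection URL, table, split column, and mapper count are illustrative assumptions, not the PIMS values.

    import subprocess

    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:t4sqlmx://tandem-host:8000",  # hypothetical Tandem JDBC URL
        "--table", "DRUG_DISPENSE",                      # hypothetical source table
        "--split-by", "DISPENSE_ID",   # evenly distributed key -> balanced splits
        "--num-mappers", "8",          # one map task per split
        "--target-dir", "/data/raw/pims/drug_dispense",
    ]
    subprocess.run(cmd, check=True)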

Education

Bachelor of Technology - Information Technology (B.Tech - IT)

Anna University
India
01.2010 - 01.2014

Skills

  • Hadoop
  • Hive
  • Sqoop
  • Spark
  • Kafka
  • HBase
  • Azure Cosmos DB
  • Azure Data Factory
  • Apache NiFi
  • dbt
  • Docker
  • Scala
  • Python
  • Azure Databricks
  • Data Lake
  • Azure SQL
  • Unix
  • Git
  • Dimension
  • Azure DevOps
  • Jenkins
  • Ansible

Certification

Azure Fundamentals (AZ900)

Websites, Portfolios and Profiles

www.linkedin.com/in/ashwen-kumar-3600a937
