Work

Projects - Professional / Internships

At Tiger Analytics

MLOps Framework Dev for Unified Data Science Platform

Led a team of three ML Engineers in developing a scalable MLOps framework on Amazon Web Services using SageMaker capabilities.
Developed and deployed an MLOps framework that simplifies deploying data science models, using services such as SageMaker, DynamoDB, EventBridge, and SNS.
Developed an organizational flow for the client to onboard new data scientists onto the framework, reducing their model development and deployment time by more than 75% (from almost 4 weeks to 7 working days).

MLOps Pipeline Dev for Price Elasticity

Led the GCP ML Accelerator Program and helped the organization become a GCP ML partner. My contribution involved presenting the organization's projects and capabilities, covering both technical strengths and the business impact of successful projects.
Led a team of four ML Engineers in developing a scalable MLOps pipeline with reproducible experiments for price-elasticity classification of SKUs.
Reduced total model training time by more than 50 percent using parallelised approaches with PySpark.
Built an end-to-end pipeline with Google Cloud Composer as the orchestrator, custom container deployment for feature preparation, and Dataproc for running PySpark jobs, training more than 30,000 SKU models in less than 3 hours.

IoT - CV - ML Exploration - Internal POC

Leading the firm's capability development in the combined space of IoT and CV.
Developing a scalable architecture on AWS with IoT Greengrass and Kinesis.
Currently in the POC phase, working with edge devices that produce two streams of data: a continuous-value stream from IoT sensors and a video stream.
Exploring scalable CV pipelines from both a DS and an MLOps perspective, and checking the feasibility of optimising the pipelines from stream preprocessing through to inference (Elastic Inference, NVIDIA DeepStream SDK, Triton Inference Server).

Login Event Anomaly Detection - ML in Cybersecurity

Developed a Feature Engineering and Ingestion pipeline using an MLOps approach.
Developed a Feature Serving and Training pipeline using GCP Vertex AI MLOps.
Developed an Online Serving and Eval pipeline using GCP Vertex AI.
Automated the entire ML development and deployment process using CI/CD with GCP services.
Developed anomaly detection models using a hybrid approach of unsupervised and supervised models, with an additional Explainable AI component.

As a Freelancer:

Project based Freelancing - Client through LinkedIn

Point Tracking for Ultrasound Videos to measure joint instability
- Developed an end-to-end framework that helps medical practitioners track different segments in an ultrasound at point level throughout the duration of a video.
- The tool provides real-time plots of the distance and angle between segments in 2D space, and supports further analysis using the data from a single run.
Knee Arthroscopy Tool
- Developing a workflow with traditional and DL-based CV to take image-based relative measurements and transfer them to real-world coordinates.
- The tool aids surgeons by providing relative real-world estimates of the space in which they are operating.
Label Studio Setup for Automated ML Labelling
- Developed code to support YOLOv5 as an ML backend for automated labelling in Label Studio.
- Integrated YOLOv5 and OpenMMLab object detection models with the Label Studio ML backend.
- Wrote scripts to run Label Studio in Colab, leveraging Colab's GPU and exposing it through an external URL tunneled with ngrok.
Website for Video Analytics
- Developing a website where users can upload videos, select or upload pretrained models (YOLO and CLIP-based formats), and get the results streamed on the same site with outputs labelled on the video.
- Used Streamlit for a basic frontend and a Model Runner API on the backend, which runs scripts according to the selection made in the UI.
- Planning to upgrade this to a React-based website with support for more models and data formats.

Full-time Machine Learning Engineer - Deep Judge:

Document Text Extraction and Summarisation
- Developed an API that extracts text from a PDF and groups pages by page start and end, using OpenCV-based logic, block detection, and Tesseract.
- Used Hugging Face T5 Transformer inference to summarise the extracted text.
Exploration of NER (Named Entity Recognition) Models (SOTA) and datasets.
Information extraction from .docx (XML) formatted documents through traditional python libraries for further NER and NLP-based tagging pipeline.
Exploration of SPARQL and Semantic Web-based approaches to graph creation for linking content.

Part-time Data Science Consultant - Aays Analytics:

Retail Analytics Forecast:
- Developing a proof-of-concept forecast model for retail fashion products (sold online) to help the planning team decide on stock intake and whether a product should be restocked.
- Experimenting with multiple algorithms; the problem is modeled as a combined classification and regression problem.

Part-time Project - Applied Computing

Auto Ticket Generation POC:
- Working on developing a platform to automate ticket generation for queries and complaints
- End-to-end platform to automate handling of the phone calls and messages received by a contact center, using AI for automatic ticket generation.
Virtual KYC:
- Developed a full-fledged virtual KYC solution with OpenCV, AWS Textract, and Facerec APIs to extract information, verify faces and textual information, and feed the results into the KYC authentication system. The solution returns all extracted information, whether the documents are acceptable, and any discrepancies found in them.

Part-time Project - News to WordPress

Multi-language news translator

Developed a FastAPI + Streamlit-based web interface that takes an image of an e-newspaper or printed newspaper, processes it using the Vision API or open-source Pytesseract to extract information at block level, detects the language, translates it into a set of languages specified by the client, and publishes it to different WordPress domains as a new post.

Client through Connections:

Reinforcement learning-based trading system:
- Developed the codebase for a base reinforcement learning-based trading framework, building on open-source work, in particular integrating Stable Baselines3 with the OpenAI Gym framework. Integrated A2C, PPO, DQN, and DDPG into the system using CNN- and MLP-based feature networks (the current default support).

Upwork (Client Details hidden):

Q&A Pipeline:
- Developing a base question-answering pipeline for demonstration and tutorial purposes using Hugging Face Transformers and PyTorch Lightning, with model metadata logged to Weights & Biases.

At New Space Research & Technologies

Computer Vision Pipeline for Targeting and Navigation:
- Worked on the existing feature matching (keypoint extraction) algorithms such as SIFT and SURF, which belong to traditional computer vision.
- To increase accuracy and generalization, worked on deep learning-based feature matching and SLAM-based algorithms such as D2-Net, SuperGlue, and LoFTR, which are mostly CNN-based or CNN+GNN-based approaches.
Platform Shift - Raspberry Pi to Jetson NX:
- Led the exploration of moving the current deployment architecture from Raspberry Pi to CUDA-based platforms such as Jetson Xavier NX and Jetson AGX Xavier to gain an advantage in accuracy and processing power.
- Worked on OpenVINO-, Triton-, TensorRT-, and DeepStream-based deployment methods as part of this exploration.
Machine Learning- Common Duties:
- Worked as part of the ML Engineering team to convert and optimize models into OpenVINO formats, improve the current object detection models, and train and deploy models.

At Quantiphi

Computer Vision for Safety:
- Worked on computer vision models to build a pipeline that ensures safety on theme park adventure and 4D rides. As part of the machine learning team, I mostly worked on image classification, object localisation and tracking, and pose estimation models, along with the cloud components that integrate the models into the pipeline.
Computer Vision for Suspect Search:
- Worked on CV models to build a pipeline in which suspects can be searched for in a video feed within seconds, and which identifies thefts and anomalous activity. I primarily worked on object localisation and tracking, person re-identification, search and clustering logic, and other cloud components.
Document Classification with Transformers:
- Integrated NLP transformer models into an existing Kubeflow pipeline and benchmarked the latency and accuracy of models such as OpenAI GPT-2, DistilBERT, RoBERTa, and Longformer.
Translation Service for Document and Web:
- Developed a solution for translating documents and web pages using Google Cloud APIs (Vision, Translation, Doc Parser) and AutoML.
Hybrid Model Deployment Scenarios - Solutions research team:
- Worked with the NVIDIA DeepStream SDK and TLT, connecting different models to optimize pipeline performance in terms of latency and accuracy.
- Worked with TensorFlow Lite to understand CPU-based deployments, optimizing further to check the feasibility of low-cost model deployments.
- Explored other CPU optimizers, primarily Intel MKL-DNN and OpenVINO, to understand CPU-level model optimizations.
- Explored federated learning to understand privacy-preserving and distributed model development.

Roles, Responsibilities and Interests

Professional Level:

Domain Knowledge:

Machine Learning
Deep Learning
Artificial Intelligence
Robotics
Cloud Computing
Edge Computing
Quantum Computing
Software Engineering
Mechatronics

Previous Responsibilities:

Exploration of SOTA Models
Research Paper reading
Developing approaches for the use case
Helping SAs with Architecture Development
Model Development Research
Model Development, Training, and Evaluation
Model Optimisation (Quantization and Pruning)
Model Deployment
Integrating model deployments with other cloud components
Regular pipeline maintenance and metadata scripts