Machine Learning Engineer

About Me

Machine Learning Engineer specializing in LLM systems, RAG pipelines, and on-prem model deployment. Experienced in building production-grade AI agents and optimizing large models for real-time enterprise use.

Experience

Rapid Acceleration Partners

Machine Learning Engineer - Intern

Jul 2023 – Jan 2024

https://www.linkedin.com/company/rapid-acceleration-partners/

Practical AI Solutions for Digital Business Transformation.

Fine-tuned LLMs (Mistral) and visual language models for real-time document extraction, object detection, and structured data generation from unstructured enterprise data. Built hyper-automation pipelines integrating LLM inference with backend systems for AI-driven document workflows.

Trycom.ai

Machine Learning Engineer

Jan 2024 – Mar 2025

https://www.linkedin.com/company/trycom/

TRYCOM AI: Simplifying content marketing with AI precision.

Built AI content automation pipelines including real-time image generation, blog/post creation with humanized text, web scraping for knowledge augmentation, LLM caching, and internal Streamlit tooling. Fine-tuned and converted BERT-based spam classifiers to ONNX for low-latency real-time inference in production. Owned the backend ML platform: managed Docker deployments, MongoDB-based LLM credit tracking, Redis caching, Celery queues, and GitHub PR reviews with code refactoring. Implemented Langfuse for LLM observability and optimized inference through structured outputs, prompt tuning, and refactored local Mistral pipelines.

Stratforge

Machine Learning Engineer

Mar 2025 – Present

https://www.linkedin.com/company/stratforge/

Breakthrough AI-human integration for enterprise transformation.

Built RAG pipelines with document extraction and Agentic RAG using MCP, ReAct agents, and LLM APIs. Developed LLM-based AI agents and real-time voice agents using MCP and LiveKit. Deployed and optimized on-prem LLMs using vLLM, including custom kernel-level performance tuning. Tuned LLM prompts, developed custom machine learning statistical models, and deployed services using Docker and FastAPI endpoints.

Projects

Techbytes AI

Instagram Page

An AI bot that gathers the latest information on a given topic every six hours, creates an Instagram, Facebook, or Twitter post with an image, and uploads it through the platform's API.

The project is currently hosted on EC2 and built with the Groq, Perplexity Discover (currently), and ScrapeOps APIs. Posts are created with a smart image-text embedding program. A live demo runs on the Techbytes Instagram page, where posts with AI-summarized content and hashtags are uploaded every six hours through Instagram's content publishing API.

Perplexity Clone

GitHub page

A Perplexity clone that uses Groq's LPU inference to serve as an up-to-date AI assistant.

This project scrapes the top Google results, or uses Brave's search API to get the top results, and feeds the scraped articles to the LLM so that it has recent knowledge and can answer questions about recent events. The Groq API handles LLM inference, and Nomic embeddings are used for the RAG implementation. A Qdrant vector database stores the vector embeddings for fast retrieval.
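As a rough illustration of the retrieve-then-prompt flow described above, here is a minimal sketch. It is not the project's code: the bag-of-words `embed` function stands in for a real embedding model, the hypothetical `InMemoryIndex` class stands in for a vector database collection, and the sample articles are invented for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector database collection."""

    def __init__(self) -> None:
        self.items: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        # Store the embedding next to the original passage.
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank every stored passage by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Place the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented sample articles, standing in for scraped search results.
index = InMemoryIndex()
for article in [
    "The city council approved a new metro line on Monday",
    "A local bakery won the national bread award",
    "Heavy rain delayed the metro line construction",
]:
    index.add(article)

question = "What happened with the metro line"
top = index.search(question, k=2)
prompt = build_prompt(question, top)
```

In a real pipeline the resulting `prompt` would be sent to the LLM API; the sketch stops before that call so it stays self-contained.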

Moviesense

Website Link

An AI system that suggests movies related to a given prompt. It returns results from 4 lakh (400,000) movies in milliseconds on a system with 1 GB of RAM.

This project is built using customized sentence transformers. Nearly 4 lakh movies are embedded in an optimized form and stored in a vector database. The system runs as a Docker service on a free AWS EC2 instance with 1 GB of RAM. Systems with more than 4 GB of RAM can produce higher-accuracy output with a RAG implementation.

Tamil BERT

Model link

This is a BERT model trained on a masked language modelling task. It can be used for many downstream tasks, such as classification and named entity recognition.

The model and the tokenizer were trained from scratch for the Tamil language. The model was trained on 10.6 million Tamil sentences on a P100 for 18 hours, resulting in better evaluation scores on other tasks.
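For readers unfamiliar with masked language modelling, the sketch below shows how training inputs are typically corrupted. It assumes the standard BERT recipe (select about 15% of tokens; of those, 80% become the mask token, 10% a random token, 10% stay unchanged); the exact settings used for this model are not stated here, and `mask_tokens` and the placeholder vocabulary are hypothetical.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (inputs, labels) for masked language modelling.

    labels[i] holds the original token at selected positions and None
    elsewhere, so a training loss would only be computed on selected positions.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            labels.append(token)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # 80%: replace with the mask token
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(token)              # 10%: keep the original token
        else:
            inputs.append(token)
            labels.append(None)
    return inputs, labels


# Placeholder tokens; a real run would use the tokenizer's output instead.
vocab = [f"tok{i}" for i in range(50)]
tokens = [f"tok{i % 50}" for i in range(200)]
inputs, labels = mask_tokens(tokens, vocab)
```

The model is then trained to predict `labels` from `inputs`, which is what lets the same pretrained weights be reused for classification or named entity recognition.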

NERF

GitHub link

NeRF (Neural Radiance Field) is a technique that converts a set of 2D images into a 3D model.

Starting from the original TensorFlow implementation, I reassembled the code in PyTorch, which makes it possible to use the customization features that PyTorch offers.

SVCE-BUS

A progressive web app that tracks the location of the college bus without the help of GPS.

This web app is built with React.js and Django. It uses the driver's mobile phone to track the driver's location and sends the coordinates to the students, and it uses the open-source Leaflet map framework to display them. The system was built to a professional standard, with security features and a login system.

OCR_From_Scratch

GitHub link

OCR for the Tamil language

Optical Character Recognition (OCR) is the technology that allows the conversion of printed or handwritten text into digital text. It has been widely used for various applications, including language translation, text mining, and document digitization. Tamil OCR specifically focuses on recognizing and extracting text from documents written in the Tamil language.

FACIT

webapp

A simple system that classifies gender from an image.

Built with convolutional layers, using data scraped from Google Images.

Obscene Detector

A detector that flags NSFW images.

This system uses masktransformer to detect the human outline in the image and then uses a classification model to classify it.

Wikipedia Data Scraping

Scraped 36,000 movie plots and 2,000 songs from various websites, including Wikipedia.

Gained experience by scraping data from many websites for self-supervised training.

Education

Sri Venkateswara College of Engineering

B.Tech, Artificial Intelligence and Data Science

2020 - 2024

Pursued a degree in artificial intelligence and data science, graduating with a CGPA of 8.59. Developed an interest in deep neural networks and data processing here.

Achievements

Analytics Showdown

Awarded first place in “ANALYTICS SHOWDOWN”, a data analytics competition hosted by KNOW-I.

LeetCode Profile

Profile Link

Completed 800+ LeetCode problems including 120+ Hard, maintaining a 686-day coding streak.

Skills

Programming Languages

Python, C/C++, JavaScript, SQL, HTML, CSS, Bash/Shell

AI & Machine Learning

PyTorch, TensorFlow, Transformers, LLMs (Fine-tuning, RAG, Agents), vLLM, llama-cpp, LangChain, LangGraph, Langfuse, MCP, ONNX, BERT, Computer Vision, NLP, Scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, OpenCV, Groq SDK, Nomic Embeddings

Web & Backend Development

FastAPI, Flask, Django, React.js, Streamlit, LiveKit, Celery, Leaflet.js

Databases, Cloud & Tools

MySQL, MongoDB, Redis, Qdrant (Vector DB), Docker, AWS (EC2), GCP, Git, GitHub Actions, Linux, Jupyter, Selenium, BeautifulSoup, ScrapeOps API

Deepak Victor