Suryateja Chalapati
Data Platforms & GenAI Engineer
Onsite · Hybrid · Remote
Summary
Data Platforms and GenAI engineer with 8+ years of experience building scalable data systems, production AI agents, LLM-powered applications, and cloud-native analytics on GCP and AWS. Specialized in designing end-to-end data pipelines, deploying autonomous agentic AI systems, and helping teams ship production-grade GenAI solutions. Certified as a GCP Professional Data Engineer and in the AWS Machine Learning Specialty.
Experience
Ford Motor Company
Software Engineer – Data Platforms & GenAI · Detroit, MI
Sep 2023 – Present
Built scalable data pipelines and AI automation, and deployed production LLM agents for Data Platforms and Supply Chain Analytics.
- Built and deployed a fully autonomous data ingestion agent on GCP that handles batch formats (CSV, fixed-width, TXT, GitHub exports) end-to-end: it raises its own PRs, monitors Tekton pipeline runs, and tracks BigQuery load status. Reduced ingestion setup from months to minutes.
- Deployed a ServiceNow incident resolution agent that autonomously triages alerts, enriches context, and delivers actionable recommendations to engineers via Microsoft Teams with an embedded live chat interface.
- Designed and implemented LLM and multimodal pipelines on GCP (Vertex AI, Hugging Face) for entity recognition, summarization, and document classification at scale.
- Developed scalable ingestion and migration workflows moving ~20 TB of data into BigQuery via Dataflow, Airflow, and GCS.
- Built ETL pipelines using Airflow, Python, and SQL to ingest batch and streaming data from APIs, Teradata, and Oracle.
- Applied GenAI models to NLP tasks using transfer learning and deep learning on GCP.
XSELL Technologies
Data Scientist – NLP · Chicago, IL
Jun 2022 – May 2023
Developed natural language models, pipelines, and metrics from unstructured data for multiple high-profile clients.
- Built spaCy Transformer NLP models achieving 93% accuracy with custom entity extraction and semantic similarity components.
- Delivered precision/recall metrics that directly enabled $500K in new business for a major client.
- Fine-tuned ALBERT Transformer models using AWS SageMaker and Hugging Face across 7 TB of de-identified transcript data.
- Designed end-to-end MLOps pipeline improving model operational efficiency by 60%.
University of South Florida
Data Scientist & Data Engineer · Tampa, FL
Jan 2021 – May 2022
- Estimated Kickstarter page consistency at 7% using Cosine Similarity and Semantic Analysis on unstructured text fetched via API.
- Identified that 36% of Reddit posts in 2020 were health-related via web scraping and NLP analysis.
- Built ETL pipelines on GCP with Airflow and migrated on-prem data to a cloud data warehouse; topic modeling (LDA, TF-IDF) yielded a highest correlation of 38%.
Tampa General Hospital
Machine Learning Engineer · Tampa, FL
Feb 2020 – Dec 2020
Optimized medical clinic operations by building ML models and metrics.
- Built a Random Forest model on patient data improving clinic operating efficiency by 38%.
- Extracted data from S3 and Redshift; created multiple databases in AWS Glue Catalog using Glue Crawlers.
- Automated model training in AWS SageMaker and built Tableau dashboards for operational insights.
Amazon
Junior Data Scientist · Hyderabad, India
Sep 2017 – Jul 2019
- Reduced freight delay risk by 15.3% via automation tools and risk assessment metrics.
- Predicted labor productivity with 82% precision (ALPS), leading to EU freight optimization.
- Increased annual shipments by 34% over 6 months via A/B Testing and process development.
- Deployed 3PL logistics sites with 92% success rate in collaboration with supply chain teams.
Mahindra and Mahindra
Data Engineer · Hyderabad, India
Jun 2015 – Aug 2017
- Optimized data imputation in S3, reducing costs by $90K and rollout times by 15.3% using AWS Athena.
- Built ETL jobs on AWS Glue to load vendor data from multiple sources with cleaning and transformation.
- Worked across AWS services: S3, EC2, Glue, Athena, Redshift, EMR, Kinesis, DMS, SNS, SQS.
Certifications
AWS Machine Learning Specialty — Amazon Web Services
Dec 2022 – Dec 2025
GCP Professional Data Engineer — Google Cloud Platform
Dec 2022 – Dec 2024 · Renewal in progress
Technical Skills
Languages
Python (spaCy, Pandas, NumPy, scikit-learn), SQL, Java, JavaScript, R, C#
Cloud & Data
GCP (BigQuery, Dataflow, GCS, Vertex AI), AWS (SageMaker, S3, Redshift, Glue, EMR), Snowflake, Databricks
AI / ML / GenAI
LLMs, AI Agents, Transformers, spaCy, Hugging Face, LangChain, NLP pipelines, MLOps, Deep Learning, Spark MLlib
Orchestration & Tooling
Airflow, Tekton, Kafka, Spark, Docker, GKE, Terraform, Git, GitHub, Tableau, Power BI
Education
University of South Florida – MS, Business Analytics & Information Systems
Aug 2020 – May 2022
GITAM University – B.Tech, Mechanical Engineering
Jul 2012 – May 2016
