Suryateja Chalapati

Data Platforms & GenAI Engineer

Open to new opportunities

Onsite  ·  Hybrid  ·  Remote

Data Platforms · GenAI / LLMs · GCP · MLOps

Summary

Data Platforms and GenAI engineer with 8+ years of experience building scalable data systems, production AI agents, LLM-powered applications, and cloud-native analytics on GCP and AWS. Specialized in designing end-to-end data pipelines, deploying autonomous agentic AI systems, and helping teams ship production-grade GenAI solutions. Certified as a GCP Professional Data Engineer and in the AWS Machine Learning Specialty.

Experience

Ford Motor Company

Software Engineer – Data Platforms & GenAI · Detroit, MI

Sep 2023 – Present

Built scalable data pipelines and AI automation, and deployed production LLM agents for Data Platforms and Supply Chain Analytics.

  • Built and deployed a fully autonomous data ingestion agent on GCP that handles batch formats (CSV, fixed-width, TXT, GitHub exports) end-to-end: it raises its own PRs, monitors Tekton pipeline runs, and tracks BigQuery load status. Reduced ingestion setup from months to minutes.
  • Deployed a ServiceNow incident resolution agent that autonomously triages alerts, enriches context, and delivers actionable recommendations to engineers via Microsoft Teams with an embedded live chat interface.
  • Designed and implemented LLM and multimodal pipelines on GCP (Vertex AI, Hugging Face) for entity recognition, summarization, and document classification at scale.
  • Developed scalable ingestion and migration workflows moving ~20 TB of data into BigQuery via Dataflow, Airflow, and GCS.
  • Built ETL pipelines using Airflow, Python, and SQL to ingest batch and streaming data from APIs, Teradata, and Oracle.
  • Applied GenAI models for NLP tasks using Transfer Learning and Deep Learning on GCP.
GCP · Vertex AI · BigQuery · Dataflow · Airflow · Tekton · LLM Agents · Python · SQL

XSELL Technologies

Data Scientist – NLP · Chicago, IL

Jun 2022 – May 2023

Developed natural language models, pipelines and metrics using unstructured data for multiple high-profile clients.

  • Built spaCy Transformer NLP models achieving 93% accuracy with custom entity extraction and semantic similarity components.
  • Delivered precision/recall metrics directly enabling $500K in new business for a major client.
  • Fine-tuned ALBERT Transformer models using AWS SageMaker and HuggingFace across 7 TB of de-identified transcript data.
  • Designed end-to-end MLOps pipeline improving model operational efficiency by 60%.
spaCy · Transformers · ALBERT · AWS SageMaker · HuggingFace · MLOps · Python

University of South Florida

Data Scientist & Data Engineer · Tampa, FL

Jan 2021 – May 2022

  • Estimated Kickstarter page consistency at 7% using Cosine Similarity and Semantic Analysis on unstructured text fetched via API.
  • Identified that 36% of Reddit posts in 2020 were health-related via web scraping and NLP analysis.
  • Built ETL pipelines on GCP with Airflow and migrated on-prem data to a cloud data warehouse; topic modeling (LDA, TF-IDF) yielded a highest correlation of 38%.
GCP · Airflow · Python · NLP · LDA · TF-IDF · Web Scraping

Tampa General Hospital

Machine Learning Engineer · Tampa, FL

Feb 2020 – Dec 2020

Optimized medical clinic operations by building ML models and metrics.

  • Built a Random Forest model on patient data improving clinic operating efficiency by 38%.
  • Extracted data from S3 and Redshift; created multiple databases in AWS Glue Catalog using Glue Crawlers.
  • Automated model training in AWS SageMaker and built Tableau dashboards for operational insights.
AWS SageMaker · S3 · Redshift · Glue · Random Forest · Tableau · Python

Amazon

Junior Data Scientist · Hyderabad, India

Sep 2017 – Jul 2019

  • Reduced freight delay risk by 15.3% via automation tools and risk assessment metrics.
  • Predicted labor productivity with 82% precision (ALPS) leading to EU freight optimization.
  • Increased annual shipments by 34% over 6 months via A/B Testing and process development.
  • Deployed 3PL logistics sites with 92% success rate in collaboration with supply chain teams.
Python · Tableau · SVM · Logistic Regression · A/B Testing · AWS

Mahindra and Mahindra

Data Engineer · Hyderabad, India

Jun 2015 – Aug 2017

  • Optimized data imputation in S3, reducing costs by $90K and rollout times by 15.3% using AWS Athena.
  • Built ETL jobs on AWS Glue to load vendor data from multiple sources with cleaning and transformation.
  • Worked across AWS services: S3, EC2, Glue, Athena, Redshift, EMR, Kinesis, DMS, SNS, SQS.
AWS Glue · S3 · Athena · Redshift · Kinesis · EMR · ETL

Certifications

AWS Machine Learning Specialty — Amazon Web Services

Dec 2022 – Dec 2025

GCP Professional Data Engineer — Google Cloud Platform

Dec 2022 – Dec 2024 · Renewal in progress

Technical Skills

Languages

Python (spaCy, Pandas, NumPy, scikit-learn), SQL, Java, JavaScript, R, C#

Cloud & Data

GCP (BigQuery, Dataflow, GCS, Vertex AI), AWS (SageMaker, S3, Redshift, Glue, EMR), Snowflake, Databricks

AI / ML / GenAI

LLMs, AI Agents, Transformers, spaCy, HuggingFace, LangChain, NLP pipelines, MLOps, Deep Learning, Spark MLlib

Orchestration & Tooling

Airflow, Tekton, Kafka, Spark, Docker, GKE, Terraform, Git, GitHub, Tableau, Power BI

Education

University of South Florida – MS, Business Analytics & Information Systems

Aug 2020 – May 2022

GITAM University – B.Tech, Mechanical Engineering

Jul 2012 – May 2016

Updated 2026