About

Senior Site Reliability / DevOps Engineer with 12+ years of experience designing, automating, and operating reliable cloud infrastructure across Azure, AWS, and GCP. Strong background in Kubernetes, Terraform, CI/CD, observability, incident management, and infrastructure automation. Proven ability to lead SRE initiatives, improve production reliability, implement scalable monitoring platforms, and partner with engineering teams to strengthen operational excellence.

Experience

    Site Reliability Engineer

     —    2 years

    • Established and led AMH's first Site Reliability Engineering team, defining reliability practices, incident response processes, and service ownership standards across critical systems.
    • Spearheaded the migration from Microsoft Application Insights to OpenTelemetry, standardizing distributed tracing and improving application performance visibility across services.
    • Led the migration from Azure Monitor to Grafana Cloud, implementing a unified observability platform using Loki, Tempo, Prometheus, dashboards, and alerting.
    • Implemented Infrastructure as Code with Terraform to automate provisioning and management of Grafana dashboards, alerts, and observability infrastructure.
    • Defined and implemented Service Level Indicators and Service Level Objectives to measure reliability, improve operational transparency, and align service performance with business expectations.
    • Designed proactive monitoring and alerting strategies to improve production issue detection, reduce response times, and support faster incident resolution.
    • Led production incident investigations and Root Cause Analysis, documenting findings and driving preventative actions to reduce repeat incidents.
    • Partnered with software engineering, DevOps, QA, SecOps, and IT operations teams to improve reliability, telemetry instrumentation, deployment practices, and operational visibility across Azure-hosted services.

    Senior DevOps Engineer

     —    2 years

    • Led the migration from GitLab CI to GitHub Actions, reducing operational costs and standardizing CI/CD workflows across engineering teams.
    • Supported AWS infrastructure for data science, application development, and engineering workloads.
    • Implemented Infrastructure as Code using Terraform, Ansible, GitLab, and Rundeck to automate provisioning, configuration management, and deployments.
    • Migrated manually managed GitLab CI runner infrastructure to Amazon EKS, improving scalability, reliability, and operational maintainability.
    • Automated Kubernetes deployments using Flux and ArgoCD, enabling GitOps-based delivery for containerized services.
    • Developed Python and Bash automation to streamline DevOps workflows and support a modular Terraform infrastructure stack.
    • Implemented Consul health checks and service monitoring to improve application reliability and operational visibility.

    Site Reliability Engineer

     —    2 years

    • Played a key role in transitioning infrastructure from on-premises operations to Google Cloud Platform and later to Azure, supporting scalable and reliable cloud adoption.
    • Configured and deployed GCP Kubernetes environments, establishing deployment processes for 50+ applications.
    • Created and managed multiple GCP projects across development, staging, and production environments, automating infrastructure provisioning with Terraform.
    • Used Terraform, GitLab CI, and Ansible to automate infrastructure provisioning, configuration management, and deployments across Azure and GCP.
    • Spearheaded the transition from Azure DevOps to GitLab, implementing Git-based workflows and modern version control practices across engineering teams.
    • Implemented automated testing in GitLab CI and integrated static and dynamic code analysis tools, reducing security vulnerabilities by 85%.
    • Configured application performance monitoring with New Relic, improving performance visibility, troubleshooting, and user experience.

    DevOps Engineer

     —    1 year

    • Maintained and enhanced a product demo platform hosted on AWS, ensuring a reliable and seamless experience for prospective customers.
    • Implemented DevOps best practices, including Infrastructure as Code, CI/CD pipelines, and configuration management using Jenkins and Groovy.
    • Created a Jenkins Shared Library to standardize and streamline build and deployment processes across projects.
    • Led the migration from OpenBSD to Linux, enabling broader DevOps tooling, improved maintainability, and modern configuration practices.
    • Developed front-end and back-end components for a security appliance UI using React, Redux, Ruby, Sinatra, RSpec, and Capybara.

    DevOps Engineer, Yardi Systems Inc.

     —    5 years

    • Maintained 99.99% availability for production environments across hybrid infrastructure.
    • Created Docker images, Docker Compose configurations, Kubernetes manifests, and ECS parameter files to support containerized deployments.
    • Managed DevOps tools and platforms including Jenkins, GitHub, TeamCity, Consul, and Ansible.
    • Containerized Nginx reverse proxy servers and implemented dynamic container provisioning to improve scalability, reliability, and security.
    • Implemented AWS infrastructure components including VPCs, subnets, Elastic Load Balancers, CloudFront, S3, and EC2, alongside Cloudflare for CDN services.
    • Developed automation and operational tooling using C#, VB.NET, and shell scripting.

Skills

    Cloud Technologies: AWS, GCP, Azure, Kubernetes, Amazon EKS, ECS (Intermediate)

    • AWS
    • GCP
    • Azure
    • Kubernetes
    • Amazon EKS
    • ECS

    Programming Languages: Python, Ruby, C#, VB.NET (Intermediate)

    • Python
    • Ruby
    • C#
    • VB.NET
    • JavaScript

    Infrastructure as Code & Automation: Terraform, Ansible, Rundeck, Consul, Packer (Expert)

    • Configuration Management
    • Terraform
    • Ansible
    • Rundeck
    • Consul
    • Packer

    Scripting Languages: UNIX Shell Scripting (bash, sh, zsh), Groovy, and Python (Intermediate)

    • Python Scripting
    • Shell Scripting
    • Groovy Scripting

    Databases: PostgreSQL, SQL Server, MySQL, SQLite, Amazon RDS/Aurora (Expert)

    • Databases
    • PostgreSQL
    • MySQL
    • SQL Server
    • SQLite
    • Amazon RDS
    • Aurora

    Networking & Cloud Infrastructure: VPC, NAT, Subnets, Load Balancers, TCP/IP, CDNs, Firewalls, CloudFront, Cloudflare (Expert)

    • Networking
    • Firewalls
    • NAT
    • VPC
    • Load Balancers
    • CDNs
    • CloudFront
    • Cloudflare

    Web Servers: Nginx, Apache, IIS (Expert)

    • Nginx
    • Ingress
    • Apache
    • Load Balancer
    • IIS

    Version Control Systems: Git, Subversion, Mercurial (Expert)

    • Git
    • VCS
    • SVN
    • HG

    Observability & Monitoring: OpenTelemetry, Grafana, Prometheus, Loki, Tempo, New Relic, Dynatrace, Honeycomb, Azure Monitor, Application Insights (Intermediate)

    • Monitoring
    • OpenTelemetry
    • Grafana
    • Prometheus
    • Loki
    • Tempo
    • New Relic
    • Dynatrace
    • Honeycomb
    • Azure Monitor
    • Application Insights

    CI/CD & GitOps: GitHub Actions, GitLab CI, Jenkins, TeamCity, ArgoCD, Flux, Azure DevOps (Expert)

    • Automation
    • CI/CD
    • Jenkins
    • GitLab CI
    • GitHub Actions
    • TeamCity
    • ArgoCD
    • Flux
    • Azure DevOps

    Platforms: Linux, Windows, OS X, OpenBSD/Unix (Expert)

    • Linux
    • OS X
    • Debian
    • Ubuntu
    • CentOS
    • OpenBSD

    Leadership (Expert)

    • Team Leadership
    • Engineering Excellence
    • Mentoring

Awards

    Scrum Master Accredited Certification

    Scrum Product Owner Accredited Certification

Languages

    English: Native speaker