Professional Experience
Senior Site Reliability Engineer (SRE)
Meituan
2020–Present
- Managed a 5,000+ node Kubernetes multi-tenant platform.
- Designed GitOps pipelines using ArgoCD, supporting 10,000+ monthly automated deployments.
- Led disaster recovery drills, restoring PostgreSQL within 30 minutes.
- Implemented service throttling, readiness probes, and automated circuit breakers.
DevOps Engineer
Didi Chuxing
2016–2020
- Migrated services from Mesos to Kubernetes, enabling microservice scalability.
- Developed a Prometheus-based SLA scoring tool for automated availability tracking.
- Built secure CI/CD workflows with GitLab CI and Harbor.
- Created internal Python-based observability and diagnostics tools.
Operations Engineer
LeTV
2013–2016
- Maintained 1,000+ transcoding and CDN nodes using Ansible.
- Built a Zabbix and ELK monitoring and logging stack with automated alerting.
- Optimized Nginx edge caching logic with Lua to improve CDN efficiency.
Junior System Administrator
Kugou Music
2010–2013
- Maintained 600+ Linux servers across web, database, and cache services.
- Implemented MySQL master-slave replication and read/write splitting.
- Automated daily operations (backups, monitoring, and log rotation) with shell scripts.