Responsibilities
- Own all aspects of network and server operations in a fast-paced environment.
- Improve the performance, monitoring, and overall stability of our platform.
- Build automated deployments for consistent, zero-downtime software releases (covering the build process, packaging, testing, and automatic deployment).
- Work with the Engineering team to design, maintain, and support the backend infrastructure behind the product.
- Participate in the development of contingency plans, including reliable backup and restore procedures.
Required Skill Set
- Expertise in installing, configuring, and troubleshooting Linux-based environments
- Expertise with firewalls and load balancers (iptables and HAProxy)
- Solid experience in the administration and performance tuning of application stacks (e.g., Tomcat, JBoss, Apache, NGINX)
- Solid scripting skills (e.g., shell scripts, Perl, Ruby, Python)
- Hands-on experience with AWS
- Solid knowledge of automation tools (e.g., SaltStack, Ansible)
- Expertise with infrastructure monitoring (e.g., Zenoss, Sensu)
- Expertise with log aggregation and analysis (e.g., Elasticsearch, Logstash, Graylog, Packetbeat)
- Proficiency with dashboards (e.g., Graylog, Kibana, Grafana)
- Proficiency with open-source time-series databases (e.g., Graphite, OpenTSDB, InfluxDB)