Senior Linux Engineer (DevOps)

  • Contract

Job Description

Job Overview:


The selected candidate will join the team responsible for the core engineering, systems architecture, and technical operations of the company’s Watch platforms. The team designs, builds, and develops large-scale, resilient platforms and systems that encompass all of our products and services.


The team collaborates closely with the company’s development and technical teams to deliver robust, distributed, and performant solutions that advance the company’s business goals.


The team is also responsible for systems architecture, platform configuration management, capacity management, site reliability, application support, monitoring, and Tier 2 incident response.


The Sr. Systems Engineer works as part of a team responsible for planning and executing the integration of the company’s Internet-based products and services into a complex data center and cloud environment in an efficient, secure, scalable, reliable, and cost-effective manner.

Qualifications

Responsibilities:


Planning


Technology Platform & Project Planning/Management (Operations)

Ensures the use of performance data and historical metrics to effectively plan for growth, upgrades, migrations and optimization

Coordinates with various teams to provision and facilitate delivery of systems and applications

Operations


Provides and maintains documentation of systems architecture, troubleshooting and support guidelines, system metrics, project information and plans, and training information for both Systems Engineering and Service Operations staff

Executes day-to-day maintenance tasks and software/platform/configuration updates, and resolves live-site issues. Develops scripts and tools to improve administration and support

Ensures solutions are maintainable and performs on-call duties

Service Management


Participates in the continual refinement of processes and policies to ensure the highest possible performance and availability of our systems

Participates in the development of best practices including capacity planning, monitoring, configuration, security, historical metrics, recovery strategies and migration strategies

Leverages the ITIL framework to support service delivery

Basic Qualifications:


5+ years of experience operating complex, large-scale, guest-facing enterprise applications or websites

Experience working in a high-capacity, highly scalable, mission-critical web-serving environment

Experience with F5 load balancing and iRules

Unix/Linux and Windows server experience, including expertise in system installation, configuration, administration, troubleshooting, performance tuning, preventative maintenance, capacity planning, monitoring, and security procedures

Knowledge and experience with emergent web and enterprise business applications:

  • Web (Apache) and Java application (Tomcat) server expertise, including installation, administration, configuration, troubleshooting, performance tuning, preventative maintenance, capacity planning, monitoring, and security procedures

Experience in at least two relevant scripting or programming languages (Ruby, Python, Shell, etc.)

Experience updating and using configuration management tools (Chef, Puppet, etc.)

Experience with administration and management of Amazon Web Services

Experience working with Docker Containers is a plus

Experience supporting NoSQL, triple store, or graph databases is a plus

Experience using development, build, deployment, and integration tools (Git, Jenkins, Nexus, Rundeck)

Experience with distributed, open source monitoring frameworks

Experience developing performance testing plans and implementing application performance testing with tools such as JMeter

Understanding of Internet standards and protocols such as HTTP, DNS, HTML, and XML

Excellent verbal and written communication skills when working with technical groups

Experience working with Service Management best practices is required