Overview
Job Purpose
The Systems Operations Analyst plays a vital role in supporting the daily operations of multiple industry-leading trading exchanges, with a focus on the infrastructure components that ensure their seamless functionality. This internal-facing position provides immediate assistance to a broad range of ICE/NYSE stakeholders (including developers, engineers, back office personnel, support teams, and IT staff) to maximize customer satisfaction and minimize the impact of IT-related issues.
This role demands a motivated, team-oriented individual who can work independently, drive projects to completion, and contribute beyond their core responsibilities. The ideal candidate will bring a strong blend of technical and business acumen, along with a comprehensive understanding of the architecture of ICE/NYSE exchanges, divisions, clearing systems, and infrastructure.
This is a 24x7 operational environment. The position requires a five-day in-office schedule and may involve varying shifts and weekend work, depending on business needs.
Responsibilities
- Automation
- Identify automation candidates to assist with Disaster Recovery and incident remediation.
- Contribute to automation projects, including scripting and building automation jobs.
- Collaborate with other team members (internal and external) on automation initiatives.
- Investigate and troubleshoot issues with existing automation.
- Incident Management
- Monitor systems and applications within the production environment.
- Diagnose and fix incidents raised through monitoring tools, conference bridges, and chats.
- Work with and escalate to internal and external teams to implement incident fixes, workarounds, and data recovery.
- Open and update production incident tickets according to company standards.
- Problem Management
- Investigate and update incident tickets with root cause and incident description, ensuring appropriate corrective action follow-up tickets are assigned.
- Manage incident tickets to closure, ensuring incident details are complete and accurate, and all corrective actions have been completed.
- Participate in continuous improvement programs, such as trend analysis of recurring issues.
- Provide and report on performance metrics of the environment.
- System and Application Production Readiness
- Work with internal and external teams to expand and maintain operational runbooks and other documentation.
- Check application and infrastructure availability and tasks at scheduled times.
- Configure monitoring tools and alarms.
- Change Management
- Ensure successful prioritization, approval, scheduling, and execution of production and DR environment changes.
- Production Deployments
- Approve and execute production deployment tasks.
- Disaster Recovery Management
- Participate in disaster recovery, business continuity, and workplace recovery events.
Knowledge and Experience
- 3+ years of cumulative, full-time experience preferred.
- Experience with enterprise monitoring solutions is a plus.
- Bachelor's degree (IT-based), or equivalent experience in IT systems support and/or operational support of applications and databases within a Linux/Unix OS environment.
- Proficiency in one or more scripting languages (e.g., Shell Scripting, Python, PowerShell)
- Understanding of network protocols and security concepts.
- Strong knowledge of operating systems (Windows, Linux, macOS).
- Strong problem-solving and analytical skills.
- Excellent communication skills (both written and verbal).
- Ability to work within a team and across teams.
- Ability to stay organized and decisive under pressure.
- Strong general IT skills, including MS Office applications.
- Capacity to relate technical concepts and events to non-technical stakeholders.
Specific Technologies
- Rundeck
- PagerDuty
- BigPanda
- RHEL