Job Description
Provides 24/7 technical operations support to clients for cloud-based solutions, covering supported applications, DevOps, middleware, security and infrastructure. The role is part of a single, integrated end-to-end (E2E) operations support function that includes Level 1.5 remediation across the fleet.

Following standard workflow and incident handling processes, receives and records incident-related information using a variety of tools and processes, selects appropriate actions to resolve issues, and communicates the solution or action plan to the client. Supports a number of tools that are part of the integrated toolchain. Uses professional knowledge and problem determination and source identification skills to resolve problems involving APIs, application services, IaaS, PaaS, SaaS, microservices, containers, Kubernetes nodes, ICP management, middleware components, network, security and infrastructure. If unable to resolve an incident, triages and routes it to the appropriate level of support.

Understands high-level cloud application architecture and is able to perform initial analysis on incidents. Provides application ID management support. Provides cloud elasticity by automatically scaling resources up or down based on business requirements. Provides disaster recovery (DR) and manual redundancy failovers. Provides daily, weekly and monthly integrated service management reports across the solution.

Works with ticketing tools such as ServiceNow, IBM Control Desk and Remedy, and with automation tools such as Rundeck and IBM Runbook Automation. Should understand the ITIL processes of Incident Management (including Critical Incident Management), Problem Management, Change Management and Integrated Service Level Management. Should have used monitoring tools such as IBM APM, New Relic, Runscope and Netcool/OMNIbus to monitor the client's environment. Should have a technical understanding of the IBM Cloud platform (Bluemix PaaS), container management, Kubernetes nodes, ICP management, HA infrastructure and load balancers.