JPMorgan Chase & Co.
Chicago, IL 60602

Director - Site Reliability Engineering

Job Description

As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment you’ll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE you’ll be focused on running better production applications and systems.

Our Consumer & Community Banking Group depends on innovators like you to serve nearly 66 million consumers and over 4 million small businesses, municipalities and non-profits. You’ll support the delivery of award-winning tools and services that cover everything from personal and small business banking to lending, mortgages, credit cards, payments, auto finance and investment advice. This group is also focused on developing and delivering cutting-edge mobile applications, digital experiences and next-generation banking technology solutions to
The Executive Director of Production Management will build relationships and work with business and IT leadership to understand current strategies, develop new ones, and assess their implications for applications and operations. Utilize experience with budgetary forecasting, global resource teams, client and vendor management, risk and vulnerability management, disaster recovery planning, and developer/code efficiency concepts. Provide input, influence decisions, and drive change across technology, not limited by organizational lines or to activities within their own department.
• Lead teams that design, code, test and deliver software to ensure application performance and resiliency
• Manage the outcome of priority incidents by facilitating blameless post-mortems, driving for higher accuracy and accountability in a product, application or service objective
• Lead development teams throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
• Own the overall customer experience and sustainability of a product and application

• Expertise in designing, coding, testing, and delivering software across multiple technology stacks
• Mastery of some infrastructure components (e.g., routing, load balancers, cloud products, container systems, compute, storage)
• Proven leadership of SRE teams and firm-wide initiatives
• Proven leadership in performance monitoring and capacity management of large systems using various tools
• Deep understanding of Site Reliability Engineering (SRE) philosophy, Chaos Engineering, technologies, platforms and tools, SLA management, incident resolution, and automation
• Hands-on experience managing operations of large-scale, internet-centric production environments for application or infrastructure services serving tens to millions of end users
• 10+ years of software engineering and/or site reliability engineering experience in one of the following: C, C++, Java (J2EE technology stack and web technologies), Python, Go, Perl, Ruby or shell scripting (Unix/Linux)
• Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring and operations, such as Kubernetes, Prometheus, FluentD, Slack, Elasticsearch, Grafana, Kibana, etc.
• 7+ years’ experience in:
  • Developing monitoring tools and log analysis tools to manage operations
  • Managing and/or influencing infrastructure services to ensure application service uptime and user experience
  • Developing and managing operations leveraging key event streaming, messaging and DB services such as Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, BigTable, DynamoDB, MongoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
• Prior experience in large-scale internet companies/technologies, where uptime and continuous availability were core to the business
• Building a team of engineers and Java developers to implement SRE frameworks
• Working with Architecture to design reusable patterns to deploy to applications, provide governance around adoption, and influence application development teams on roadmaps and designs
• Identifying and partnering with Infrastructure teams and AD teams to implement automation opportunities to drive down toil and reduce technical debt
• Applying standards of cloud compliance to application design to achieve reliability
• Understanding of networking and cloud technologies, for example security, load balancing, and network routing protocols
At JPMorgan Chase & Co. we value the unique skills of every employee, and we’re building a technology organization that thrives on diversity. We encourage professional growth and career development, and offer competitive benefits and compensation. If you’re looking to build your career as part of a global technology team tackling big challenges that impact the lives of people and companies all around the world, we want to meet you.

Ready to use your expertise and experience to drive change? Apply today.
It's time to take your career to the next level, and we can help. Apply today.
Apply today, and put your passion for technology to work at JPMorgan Chase & Co.

Req #: 190108544
Location: Chicago, IL US
Job Category: Technology
Employment Type: Full Time
Potential Referral Amount: 5000 US Dollar (USD)


Posted: 2020-05-06 Expires: 2020-06-13
