SRE and Operations Engineer

We empower you to make a difference. Be sure, go further, act smarter!

We are looking for an SRE and Operations Engineer to join our team.

What's In It for You:

This role is a good fit if you want to own the automated processes for building and deploying software on GCP/AWS and managed hosting, and to develop the scripts and software needed for all SRE activities.

What Can You Expect to Be Doing as an SRE and Operations Engineer:

  • Design, implement, and maintain fully automated infrastructure as code that integrates into a CI/CD pipeline;
  • Be part of a 24/7 on-call SRE team;
  • Maintain platforms after launch by measuring and monitoring availability, performance, and overall system status;
  • Run the production environment by monitoring data availability, latency, and quality, taking a holistic view of system health;
  • Develop the product with emphasis on the link between the end user and the Product Owner;
  • Troubleshoot, assess, and resolve operational challenges and support escalations. Recover platforms during production incidents to meet the targeted SLOs, with end users as the priority;
  • Write blameless postmortems focused on improving future performance;
  • Work on the Error Budget with the Product Owner to implement the needed features;
  • Perform cyclic maintenance and updates of the operational landscape (software versions, patches, support for SW/HW maintenance activities of the provider, etc.);
  • Proactively identify and implement operational improvements for processes, performance, and reliability;
  • Handle monitoring and alerting operations for SLA-critical production platforms, including issue resolution and manual intervention (in close cooperation with software development and Ipsos Dev teams);
  • Ensure reliable operation of applications for projects across the organization, and help shape our delivery and mindset so that IT and business goals are aligned;
  • Provide active support and guidance during the design and development of new services for our platforms. Work as a senior member of a Scrum team;
  • Participate in the development and execution of the SRE strategy;
  • Apply the mindset and practices of a software engineer.



The SRE and Operations Engineer job might be a good fit for you if you have:

  • 5+ years of experience, including DevOps, Software Engineering, Site Reliability Engineering (SRE), and on-call rotations, working on highly scalable distributed systems;
  • 2+ years of experience with PowerShell scripting and Python;
  • Bachelor's or master's degree in computer science or a related field, or equivalent experience;
  • Experience managing Windows Server (2019, 2022) and Linux;
  • Experience working with monitoring systems such as Zabbix, Prometheus, or similar;
  • Experience using version control systems such as Git;
  • Experience with orchestration/automation tools (Ansible, Terraform, Packer, etc.);
  • Experience with container technologies (Docker, Kubernetes);
  • Experience deploying code using CI/CD tools like GitHub within change management procedures;
  • Experience working with Jira;
  • Experience with cloud-native application development and cloud technologies on GCP and AWS;
  • Strong understanding of cloud concepts, on-premises infrastructure, and platforms;
  • Proven ability to quickly learn new technical domains and then train others;
  • Great verbal and written communication skills;
  • Knowledge of web security, networking, and application architecture.

Benefits:

  • Flexible working
  • Flexible Benefits platform
  • Employee Assistance Program
  • Friendly working environment
  • Career path
  • Online learning platform
  • Rewards program
  • Internal events
  • Referral & Seniority bonus
  • Additional Vacation days.

Location

Bucharest, Romania

Job Type
Full Time