Site Reliability Engineer

Greece
Permanent
Full-time

Posted 27 days ago

At Paymentology, we're redefining what's possible in the payments space. As the first truly global issuer-processor, we give banks and fintechs the technology and talent to launch and manage Mastercard, Visa, and UnionPay cards at scale - across more than 60 countries.

Our advanced, multi-cloud platform delivers real-time data, unmatched scalability, and the flexibility of shared or dedicated processing instances. It's this global reach and innovation that sets us apart.

We're looking for a Site Reliability Engineer to ensure the high availability, scalability, and performance of our platform. This role is essential to maintaining reliable systems, reducing operational overhead, and enabling continuous improvement across our global technology landscape. If you're passionate about automation, incident response, and working at the intersection of infrastructure and software, this is your opportunity to help build resilient systems that power financial inclusion worldwide.

What you get to do:

Platform Reliability and Scalability

Build software that enhances Paymentology services' scalability and reliability.
Ensure platform services meet required uptime and service quality levels.
Contribute to the design of reliable cloud infrastructure and implement reusable cloud-uptime components as code.
Regularly review and optimise SRE practices, tools, and methodologies to enhance overall system reliability and team efficiency.

Observability and Automation

Contribute to the design, implementation, and maintenance of observability and monitoring solutions that track the platform's health, cost-effectiveness, reliability, and scalability, and that identify potential issues to feed back to product and platform engineering in a continuous improvement loop.
Develop and implement automation scripts and tools to streamline operations and reduce manual interventions.
Enable product teams to self-serve by participating in the development of a developer platform.

Production Issue Resolution

Play an active role with the incident response teams, diagnosing and resolving production issues quickly to minimise downtime.

Standards Compliance

Support product teams in building services that adhere to our security and quality standards.

Cross-team Collaboration

Work closely with engineering, operations, and product teams to ensure reliability is considered throughout the end-to-end software development lifecycle. We seek to achieve this through advocacy and developing a culture of reliability.

Requirements: What it takes to succeed:

Strong understanding of cloud networking principles.
Proficiency with leading monitoring tools, such as Datadog, Honeycomb.io, Splunk, Prometheus, Grafana, ELK Stack, and New Relic.
Programming expertise, especially in systems programming languages and databases.
Familiarity with one of the industry-leading CI/CD tools, such as Jenkins, GitHub Actions, GitLab CI, CodePipeline, CircleCI, or ArgoCD.
Proven track record of achieving platform-level and end-to-end SLIs, SLOs, and SLAs, and of fostering accountability.
Ability to navigate complex situations and lead effective post-incident reviews (PIRs).
Knowledge of implementing solutions to reduce Mean Time to Identify (MTTI) and Mean Time to Resolve (MTTR).
Comprehensive understanding of large-scale distributed platform architecture.
Expertise in implementing best practices for load balancing, fault tolerance, and resource allocation to maintain service quality and efficiency at scale.
Understanding of security best practices within cloud environments.

Education and Experience:

Bachelor's Degree in Computer Science, Information Technology, or related field.
Professionals with a verifiable employment history in the role may also be considered.
2+ years of experience as a Site Reliability Engineer.
2+ years in software development.
Extensive cloud experience, especially with AWS.
Proven expertise in one of the infrastructure-as-code tools, such as Terraform, CloudFormation, Puppet, or Ansible.
Hands-on experience with Docker, ECS, EKS, and Kubernetes.

What you can look forward to:

At Paymentology, it's not just about building great payment technology; it's about building a company where people feel they belong and their work matters. You'll be part of a diverse, global team that's genuinely committed to making a positive impact through what we do. Whether you're working across time zones or getting involved in initiatives that support local communities, you'll find real purpose in your work - and the freedom to grow in a supportive, forward-thinking environment.