Site Reliability Engineer

Tech Stack

GCP
KUBERNETES
DATADOG/DYNATRACE
GRAFANA
PROMETHEUS
VICTORIAMETRICS
TERRAFORM

Job Description

InPost Group is an innovative European out-of-home delivery company, revolutionizing the way parcels are delivered to customers.

With operations across several countries, our network of intelligent lockers provides customers with a fast, convenient, and secure delivery option.

InPost Group is a publicly traded company, with a market capitalization of about $5 billion as of March 2023. With over 10,000 employees worldwide, InPost Group is one of the largest out-of-home delivery providers in Europe, committed to providing sustainable and efficient delivery solutions to meet the evolving needs of customers in today's rapidly changing landscape.

At InPost, we’re passionate about building software that helps our customers send and receive their goods.

We count on our site reliability engineers (SREs) to empower users with a rich feature set, high availability, and stellar performance level to pursue their missions.

On our way to the cloud, we’re seeking an experienced SRE to help in the transformation while delivering insights from massive-scale data in real time.

Specifically, we’re searching for someone who has extensive experience running applications in the cloud, brings fresh ideas and a unique viewpoint, and enjoys collaborating with a cross-functional, international team to develop real-world solutions and positive user experiences for every interaction.

Objectives of this role:

Help development teams deliver reliable, well-performing software.

Improve reliability, quality, and time-to-market of our suite of software solutions.

Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.

Provide primary operational support and engineering for multiple large-scale distributed software applications.

Transform and design the infrastructure for cloud and privately hosted applications.

Build and maintain tooling to shorten deployment lead time (quality assurance, monitoring, observability).

Run the production environment by monitoring availability and taking a holistic view of system health using modern APM tools.

Build software and systems to manage platform infrastructure and applications.

Responsibilities:

Rightsizing and FinOps in cloud and on-prem environments.

Gather and analyze metrics from all systems (operating systems, applications, PaaS like Google Cloud SQL) to assist in performance tuning and fault finding.

Partner with development teams to improve services through rigorous testing and release procedures.

Participate in system design consulting, platform management, and capacity planning.

Create sustainable systems and services through automation and uplifts.

Balance feature development speed and reliability with well-defined service-level objectives.

Assist local Ops, Administrators, and Networking teams.

Develop simple applications to help with process automation.

Qualifications:

Bachelor’s degree in computer science or a related discipline (or equivalent proven experience in the field).

Ability to program in one or more high-level languages (structured or object-oriented), such as Python, Go, Java, C/C++, Ruby, or JavaScript.

Experience with distributed storage technologies such as Google Cloud Storage, as well as dynamic resource management frameworks, especially Kubernetes.

Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.

Knowledge of Google Cloud (GCP) and its services, such as IAM, VPC networking, Cloud Run, and others.

Experience using Terraform to create, maintain, and test infrastructure.

Familiarity with tools like Dynatrace/Datadog, Grafana, Prometheus, and VictoriaMetrics.

Advanced English speaking ability (C1 or better).

Ability to communicate in French (B2 or better) will be considered a big asset!