4 Site Reliability Engineer jobs in Egypt
Site Reliability Engineer (SRE)
Posted today
Job Description
At Taager, the tech team is fully remote, distributed across multiple continents, and works on a flexible schedule. We aim to allow everyone to grow and improve while respecting everyone’s work/life balance.
**Our Top 6 Reasons to Work for Taager**:
- You will work remotely, and with flexible hours.
- We offer above-market salaries and lucrative stock options.
- We invest in our people; we offer learning programs to elevate our team's skills and performance.
- We offer unlimited vacation, as well as birthday and mental wellness leave.
- We offer a comprehensive Medical Insurance package.
- Most importantly, you will be working with brilliant, ambitious, and caring co-workers from different countries across the world.
**About the role**:
**Responsibilities**:
- Work on managing and keeping the cost of infrastructure under control.
- Contribute to and continually improve our site architecture, feature components, development process, SDLC tools, and system design.
- Research new technologies or methodologies that can improve the architecture, performance, cost optimization, or development process.
- Proactively diagnose problems identified in production and recommend solutions.
- Support development environments to help us achieve our delivery and quality goals.
- Research and evaluate technologies, tools, and services to influence buy-vs-build decisions.
**_Must Have:_**
- Strong understanding of DevOps culture and best practices.
- Deep understanding of software lifecycle.
- Aware of agile concepts.
- Understanding of OS.
- Good understanding of Internet infrastructure (HTTP, DNS, etc.).
- Experience in designing and developing CI/CD pipelines
- Experience in scripting (Bash or Python)
- Experience in version control
- Experience in Infrastructure as Code (IaC).
- Experience in Configuration as Code.
- Experience in designing and building DevOps solutions.
- Experience in containerization (Kubernetes, Docker, and Helm charts)
- Experience in any Cloud provider (AWS, GCP or Azure)
**_Nice to Have:_**
- Extensive Experience in AWS.
- Extensive Experience with Kubernetes.
- Extensive Experience with Terraform
- Extensive Experience with Canary Deployment.
- Extensive Experience with Monitoring solutions.
Senior Site Reliability Engineer I
Posted today
Job Description
**About the team**
We are looking for engineers who will work within the Cloud Infrastructure Foundation team. The Infra Foundation team develops and maintains cloud-native technology for the Careem Service teams:
- Highly scalable Kubernetes clusters
- Highly reliable, secure, and performant KONG-based API Gateways
- Cloud Access management automation and integration with k8s
**About the role**
We need expert, execution-focused engineers to help shape the future of the Careem platform and to significantly scale our already sizable effort. As an SRE at Careem, you'll architect, build, and maintain the ecosystem described above to ensure the resilience and reliability of our services and to speed up deployments, with the aim of improving products used by millions of customers every day. Key responsibilities include:
- Make an impact from the design phase through the development and operation of the Kubernetes cluster and its ecosystem on AWS
- Develop and integrate KONG API Gateway Plugins that are low latency and secure.
- Build core services, tooling and create technical processes that simplify and enable engineers across multiple services
- Identify, automate, and scale system configurations without compromising security or reliability.
- Participate in on-call rotations and help improve incident response
**Qualifications**
- 3+ years of experience in architecting, developing, operating, and troubleshooting Kubernetes clusters and/or other highly available systems at scale.
- Preferred: hands-on experience deploying and operating KONG as an API gateway or Ingress controller.
- Experience with at least one of the following programming languages: Go, Python, Java, Rust, C++
- Experience with infrastructure automation tools such as Terraform, CloudFormation, or Pulumi.
- Strong Unix or Linux background, including concepts such as processes, network stack, and memory allocation
- Experience with cloud-native services on AWS/GCP/Azure
- Incident response and/or incident management experience
- Experience with DevOps topics such as monitoring, CI/CD, and security is a plus
- Effective communication and collaboration skills, including the ability to drive and promote technical partnerships across teams
**What we’ll provide you**
We offer colleagues the opportunity to drive impact in the region while they learn and grow. As a Careem colleague you will be able to:
- Work and learn from great minds by joining a community of inspiring colleagues.
- Put your passion to work in a purposeful organisation dedicated to creating impact in a region with a lot of untapped potential.
- Explore new opportunities to learn and grow every day.
- Enjoy the flexibility that comes with the trust of being an owner; work in a hybrid style with a mix of days at the office and at home, and remotely from any country in the world for 30 days a year with unlimited vacation days per year.
- Access healthcare benefits and fitness reimbursements for health activities, including gym, health club, and training classes.
Senior Site Reliability Engineer (1025us)
Posted today
Job Description
Our team is over 50 people, including web (C#/.NET, Java, JS) and mobile (iOS/Android/Ionic) developers together with business analysts, project managers, QA, and support staff. Our corporate culture is characterized by agile processes, autonomous teams without hierarchies, as well as openness and transparency, both internally and with our clients. Currently, we are searching for a Senior Site Reliability Engineer to join our large team of professionals in Cairo. We are looking for an active, responsive, and devoted person.
**The Work**:
- Plan, design, and implement seamless VCF upgrades of infrastructure and services through automation
- Plan and implement integration of extensible services and platforms of VCF
- Ensure compliance of the services with high security standards, including inventory and access control monitoring and reporting
- Maintain high availability, stability, and reliability of services
- Maintain optimization by implementing and deploying automated solutions
- Develop and maintain up-to-date, clear, and effective service operations
- Work closely with development teams to improve the maintainability and reliability of services
- Own and provide proactive service support by participating in regular business calls
- Identify, gather, analyze, and automate responses to key performance metrics, logs, and alerts
- Understand various third-party integrations with VMware Cloud Foundation (VCF) and highlight any interoperability issues
- Define project requirements/issues/constraints with PMs / SMs
- Take ownership and accountability for individual deliverables.
- Cover a 24/7 rotational shift pattern if required
**Requirements**:
1. Strong Linux sysadmin experience
2. Strong networking and IT infrastructure knowledge
3. Good knowledge of cloud technologies (and VMware VCF if possible)
4. Good scripting/coding skills: scripting knowledge, preferably Bash (Linux), TypeScript, or JavaScript, and/or any OOP (object-oriented programming) language such as Python.
**We offer**:
- Financial stability
- Interesting and challenging projects within professional self-managed teams
- Friendly team and a comfortable working environment.
- Flexible schedule (8-10 AM start) with the possibility to work assigned hours and/or adjust the work schedule as requested by the manager
- 21 working days of paid annual vacation.
- Health insurance.
- Social insurance—the highest level
- Paid sick leave.
- Performance review after six months
Why You Should Work With Us:
We work as a self-driven team without complex management structures. Our teams make independent decisions without recommendations from the client. We nurture an open, transparent environment where we all enjoy our work.
Head of Cloud Infrastructure Engineering, Fintech
Posted today
Job Description
Levelset, a Procore company, is building the software that empowers people in construction to get what they earn. We provide cloud-based construction management software that makes lien rights, payment paperwork, and compliance in the construction industry simple and stress-free, so contractors and suppliers can get paid faster, have easier access to capital, and face fewer surprises.
**Job Description**:
We’re looking for a **Head of Cloud Infrastructure Engineering** to join Procore Fintech.
Our mission at Procore Fintech is to improve access to working capital and levelset risk in construction. Our group builds financial products powered by risk data to improve contractors' access to capital and insurance products. We are using big data technologies and our proprietary risk data graph to transform how financial risk data can power construction fintech products. We run software and data systems that ingest, process, and organize data from hundreds of sources, including licenses, permits, liens, and hundreds of other signal sources. In addition, our suite of products and services and thousands of existing users provide us with a wealth of construction payment data that can be used to enrich our risk data graph and inform our data-powered construction risk products.
This role reports directly to the Chief Technology Officer of Procore Fintech. We’re looking for someone to join us immediately.
**What you’ll do**:
- Maintain and improve reliability, availability, and scalability of all existing cloud infrastructure systems
- Maintain and develop our CI/CD systems
- Lead our on-call DevOps support engineering team to address infrastructure incidents and provide timely infrastructure operations support to other teams
- Manage and expand our growing cloud infrastructure engineering team
**What we’re looking for**:
- Degree in Computer Science or related discipline; or comparable work experience
- 10+ years of software engineering experience
- 5+ years of people management experience
- Experience designing, building, and maintaining distributed systems; experience optimizing for reliability of distributed systems
- Experience in modern infrastructure and cloud operations technologies (e.g. AWS, Kubernetes, Jenkins, Docker, Terraform)
- Familiarity with big data systems and data streaming technologies (e.g. Kafka, RabbitMQ, Elastic, Neo4j, MongoDB, Airflow, Spark, Databricks)
- Can write high-quality code in several programming and scripting languages
Additional Information
If you'd like to stay in touch and be the first to hear about new roles at Procore, join our Talent Community.
**About Us**
Procore Technologies is building the software that builds the world. We provide cloud-based construction management software that helps clients more efficiently build skyscrapers, hospitals, retail centers, airports, housing complexes, and more. At Procore, we have worked hard to create and maintain a culture where you can own your work and are encouraged and given resources to try new ideas. Check us out on Glassdoor to see what others are saying about working at Procore.
We are an equal opportunity employer and welcome builders of all backgrounds. We thrive in a diverse, dynamic, and inclusive environment. We do not tolerate discrimination against employees on the basis of age, color, disability, gender, gender identity or expression, marital status, national origin, political affiliation, race, religion, sexual orientation, veteran status, or any other classification protected by law.