LifeRaft (Social Navigator Inc.) is a company based out of Halifax, Nova Scotia, that provides a SaaS-based threat intelligence platform to security and business risk teams in the corporate/commercial sector.
We are seeking a Site Reliability Engineer with strong DevOps skills to join our team. In this role, you will ensure the stability, scalability and performance of our operations.
We are looking for proactive individuals who take ownership of their work and overcome roadblocks while contributing to a world-class platform.
If you want to work with a dynamic, fast-growing team with an innovative spirit and a determination to help solve new challenges posed by the complexities of open source data, this could be the opportunity for you.
Responsibilities / What you’ll be doing:
- Apply your software/programming expertise, combined with your in-depth understanding of system and cloud services, to continuously improve our infrastructure and products.
- You will be responsible for the uptime, performance and operational cost of our cloud platform and SaaS-based product. You will make daily and weekly operational decisions with the goal of improving uptime while reducing cost. You will drive improvements by staying familiar with emerging trends in cloud and SaaS technologies. All your decisions will be focused on providing best-in-class service to the users of our SaaS products.
- You will be responsible for monitoring the performance, latency and scalability of our data collection/processing systems
- Responsible for cloud platform production infrastructure.
- Respond to production incidents: receive on-call notifications of service disruptions and coordinate with development teams to resolve outages.
- Create and share incident post-mortems.
- Set up and maintain observability infrastructure.
- Maintain service level objectives.
- Monitor and improve service performance and scalability.
- Reduce MTTR and false positive rates.
- Ensure and report on GRC compliance.
- Reduce toil (automate, automate, and automate some more).
- You will ensure effective monitoring of key system metrics, perform capacity planning to keep systems running efficiently and have a response plan in place for incidents.
- You will be responsible for maintaining and testing our Disaster Recovery and Business Continuity plans.
- You will be responsible for the overall security of the system (networking, firewall, credentials)
- Responsible for change management and dependency management, and for helping manage and support code repositories and the deployment/integration pipeline.
- Aggregate and review logs on a regular basis, taking action when necessary
Qualifications / What we’re looking for
You would be a great fit for this opportunity if you have experience as a Site Reliability Engineer or DevOps engineer, strong communication skills, and the following:
- 3+ years of hands-on experience administering large-scale SaaS applications on one of the major cloud platforms (AWS, GCP…)
- In-depth knowledge and 3+ years of hands-on experience with Linux, networking, firewalls, VPNs
- Fluent in scripting with 3+ years of experience in a popular coding language (Python, PHP, …)
- Experience with Docker
- Experience in automating cloud system monitoring and operations with scripting
- Familiarity with the tools of the trade: experience with code review systems, issue tracking tools, build tools, test frameworks, code quality tools, CI systems, and IDEs
What’s in it for you
Impact: Navigator advances corporate security and is designed to identify, track, and validate issues from open source channels (surface, deep web, and darknet) related to executive safety, fraud prevention, and infrastructure protection.
Flexibility: The LifeRaft team has employees working across Canada and around the world from the comfort of their own homes. That being said, you are also welcome to work from the Halifax office with the rest of the team… and their office dogs!
Perks: LifeRaft offers a comprehensive benefits plan, a fun and informal office culture, free parking and above-average office-supplied coffee.
Challenge: Your work solves real-world problems and bridges the gap between physical and digital security challenges.
- Office dogs (when we aren’t working remotely)
- Flexible hours
- Partial phone plan coverage
- Medical, Dental and Vision Coverage
- Paid Time Off
- Paid Parental Leave
- Great office parties
- Hilarious co-workers
- Diversity & inclusion committee!
To apply, send your resume and cover letter to firstname.lastname@example.org