Requirements
------------
### Must have:
- We require a bachelor's degree in Electrical and Computer Engineering, Computer Science, Information Technology, or a related discipline, plus 6 years of relevant experience.
- Alternatively, we accept a master's degree in Electrical and Computer Engineering, Computer Science, Information Technology, or a related discipline, plus 4 years of relevant experience.
- We need hands-on experience administering cloud-native services across platforms such as Kubernetes, virtual machines, networking components, and environments focused on high availability and security.
- We need experience managing cloud infrastructure across multiple accounts or subscriptions, with emphasis on cost control, tagging, role-based access control, governance, and compliance enforcement.
- We require experience applying DevOps and Site Reliability Engineering practices to strengthen development workflows and improve production reliability and availability.
- We need experience designing alerting and observability approaches using SRE golden signal principles to track system health and performance.
- We require experience supporting distributed systems and resolving production incidents in microservices-based environments.
- We need experience configuring and monitoring telemetry for microservices using cloud-native tools such as CloudWatch, Azure Monitor, Application Insights, and Log Analytics, depending on the platform.
- We value proficiency with common DevOps technologies, including Docker, GitHub, Jenkins, Terraform, SonarQube, and JFrog.
- We require familiarity with at least one APM platform, such as Datadog, Dynatrace, Splunk SignalFx, AppDynamics, or Azure Monitor.
- We prefer scripting experience in Bash, Python, Groovy, or PowerShell for automation and operational tooling.
- We expect a strong understanding of application and system architecture, including networking fundamentals, cloud concepts, and microservices design.
- We require the ability to support a full-time schedule with on-site presence at our Northbrook, IL location 1 to 3 days per week.
Responsibilities:
-----------------
- We create, refine, maintain, automate, and troubleshoot highly scalable, dependable, and complex systems.
- We work closely with a skilled engineering team that delivers mission-critical infrastructure and software with strong availability, performance, and security.
- We monitor and administer systems and infrastructure deployed both on-premises and in the cloud.
- We install, configure, test, and maintain advanced technical platforms and system architectures.
- We manage and support our Kubernetes platform, including deployments and services.
- We use DevOps tooling such as Docker, GitHub, Jenkins, Terraform, SonarQube, and JFrog to improve delivery and operations.
- We build simple to moderately complex scripts and programs to automate tools, frameworks, dashboards, and alerting.
- We provide 24/7 on-call support, along with second- and third-level troubleshooting for operational issues.
- We investigate and resolve day-to-day production problems to keep services stable and reliable.
Company:
--------
We are Medline Industries, LP, and we are hiring Engineers for our Site Reliability team in Northbrook, Illinois. This is a full-time role with a Monday through Friday schedule, 8:00 a.m. to 5:00 p.m., and telecommuting is permitted with an expectation to work from our Northbrook site 1 to 3 days per week. The annual pay range for this position is $153,317.00. We offer a comprehensive benefits package that may include health insurance, life and disability coverage, 401(k) contributions, paid time off, and additional resources depending on hours worked. We are committed to fostering a workplace where everyone feels they belong and can grow, and we support diversity, inclusion, and equal opportunity for all qualified individuals.
----------------------------------------
The Site Reliability Engineer (SRE) at TP-Link Systems Inc. plays a pivotal role in maintaining the reliability and performance of the company's cloud platforms. Key responsibilities include building observability for microservices, analyzing production risks, and ensuring compliance with security standards.
In this role, SREs are expected to create and maintain comprehensive technical documentation, which includes architecture diagrams and standard operating procedures. Additionally, they must be adept at troubleshooting issues related to resource allocation, such as CPU and memory management, to optimize system performance.
To be considered for the Site Reliability Engineer position, candidates must possess a strong foundation in programming and scripting languages, such as Java, Python, Bash, or PowerShell. Hands-on experience in SRE practices, DevOps methodologies, and cloud operations is essential for success in this role.
Moreover, familiarity with security technologies, including identity and access management, network security, and data protection, is crucial. Experience with container orchestration technologies like Kubernetes is also highly valued, as it enhances the candidate's ability to manage and deploy applications effectively.
TP-Link Systems Inc. prides itself on fostering a culture of professionalism, innovation, and excellence. The company is committed to creating an inclusive work environment that values diversity and encourages collaboration among its employees.
With a focus on continuous improvement, TP-Link invests in the personal and professional development of its workforce. This commitment is reflected in its various team-building events and health and wellness benefits, which aim to enhance employee satisfaction and productivity.
The compensation package for the Site Reliability Engineer position at TP-Link Systems Inc. is competitive, with a salary range of $100,000 to $140,000. In addition to salary, employees receive a range of benefits, including fully paid medical, dental, and vision insurance, along with 401(k) contributions.
Other perks include free snacks and drinks, catered lunches on Fridays, and health and wellness benefits such as gym memberships. Employees also benefit from bi-annual performance reviews and annual pay increases, ensuring that their contributions are recognized and rewarded.