Posted 1 month ago
Secret
$160,000 - $200,000
IT - Data Science
Washington, DC (On-Site/Office)
Data Pipeline Reliability Engineer
Location: Washington, D.C.
Job Description:
Serve as a Data Pipeline Reliability Engineer (DPRE) on a cross-functional team
Help ensure our customers’ missions are supported with up-to-date, accurate data
Build, optimize, and maintain data pipelines to improve their efficiency and resilience
Serve as a first responder, triaging, troubleshooting, and coordinating the resolution of technical issues
Diagnose, resolve, and prevent issues encountered in the field
Implement and maintain automated monitoring to detect data quality issues
Embed with business teams to minimize risks associated with product deployments
Collaborate with customer-facing teams to increase reliability of data pipelines
Improve the performance and stability of production data pipelines by:
Installing data health metrics and automated alerts
Documenting strategies for responding to incidents
Qualifications:
Strong engineering background
Proficiency with programming languages such as Java, C++, Python, or JavaScript
Basic parallel data processing experience
Basic understanding of optimizing Spark jobs
Experience performing root cause analysis and documenting findings
Understanding of, or experience with, data concepts such as:
Data warehousing
Data lakes
Data governance
Data lineage
Understanding of networking concepts (DNS, VPNs, load balancing)
Experience with the following tools:
Observability tools (e.g., Grafana)
Data pipeline tools (e.g., Airflow)
Cloud tools (e.g., AWS, Azure, Google Cloud)
IaC tools (e.g., Terraform)
Required: Active Secret Security Clearance