Data Engineer
Location: DC Metro area
Summary
We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
Tasks and Responsibilities
- Identify user requirements across the Agency and its affiliates and use them to create products that foster a culture of data-informed decision making
- Begin development of data pipelines, data storage architecture, web-based user interfaces, and data visualizations
- Assemble large, complex data sets that meet functional / non-functional business requirements
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Microsoft Azure ‘big data’ technologies
- Build analytics tools to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs
- Create data tools for analytics and data scientist team members that assist them in building and optimizing new products and research
- Work with data and subject matter experts to develop greater functionality in our data systems
- Support the maintenance of data storage by establishing protocols, user interfaces, and client education
- Analyze and interpret data to provide insights into the Agency’s operations
- Build systems and methodologies to deliver marketable products that highlight organizational assets to internal stakeholders, external media partners, government agencies and Congress
- Contribute to building measurement models and indexing systems for reporting
- Develop custom data models and algorithms to apply to data sets
- Use statistical computer languages to manipulate data and draw insights from large data sets
- Contribute to and help maintain data structure for OPR Research
- Develop processes and tools to monitor and analyze model performance and data accuracy; identify data analysis problems where insights will be useful to the Agency and its networks
- Develop algorithms to mine and analyze data from databases to drive optimization and improvement of product development, marketing and business strategies
- Develop metadata standards for the Agency’s Research and other data
Required Qualifications
- 5+ years of progressive experience and demonstrated responsibility, including the ability to actively follow and apply learnings from the technology industry, digital best practices, and associated audience behaviors
- Commitment to a culture of learning, curiosity, and open debate
- Ability to efficiently lead team projects, supervise, and delegate to specialists
- Ability to supervise, work alongside, and coordinate with other developers using digital tools such as Microsoft Teams, Github, Azure DevOps, and others
- Skill in oral and written communications (English) to interface with individuals at all levels, both within and outside the government; develop, negotiate, and defend recommendations; present briefings on sensitive and controversial subjects; and provide guidance and instructions
- Comprehensive knowledge of a wide range of qualitative and quantitative state-of-the-art analytical and evaluation methods and techniques and a demonstrated track record of a continuously growing skill set
- Advanced SQL knowledge and experience with relational databases and query authoring, as well as working familiarity with a variety of databases
- Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets, including API integrations from external and internal sources (social media, news, etc.)
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
- Strong analytic skills related to working with unstructured datasets
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management
- A successful history of manipulating, processing and extracting value from large disconnected datasets
- Working knowledge of message queuing, stream processing, and highly scalable data stores
- Strong project management and organizational skills
- Experience supporting and working with cross-functional teams in a dynamic environment
- Must have experience using the following software/tools:
  - Big data tools: Hadoop, Spark, Kafka, etc.
  - Relational SQL and NoSQL databases
  - Data pipeline and workflow management tools: Azure Data Factory, Azkaban, Luigi, Airflow, etc.
  - Azure cloud services: ADLS Gen2, VM, etc.
  - Stream-processing systems
  - Object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.