Job Description
Our valued Federal Government client is currently seeking a Data Engineer for a 1+ year remote contract opportunity. The Data Engineer will gather and harmonize data sets to meet the project's business needs and will ensure that the database infrastructure is robust, flexible, and secure across multi-model environments (relational, document, graph, vector, etc.), with efficient data flow, consistency, and adherence to security and privacy protocols.
Roles and Responsibilities
- Database Management and Integration: Work with multi-model databases (relational, document, graph, vector) such as Neo4j, TerminusDB, and SurrealDB for data structuring, linking, and retrieval, and integrate them with systems such as Directus and vector databases (e.g., Weaviate).
- ETL Systems: Design ETL systems to manage complex processing of regulations, including schema creation, data transformation, and storage solutions specific to regulatory data.
- Data Pipeline Development: Construct reliable pipelines using tools like Spark, Hop, and Kettle for streamlined data operations.
- Data Preprocessing: Implement data cleaning and normalization techniques to improve data quality and usability, preparing data for data scientists and other users.
Qualifications and Skills
- 8+ years of experience in Data Engineering
- 8+ years of experience utilizing Python
- 5+ years of experience in data pipeline development using tools such as Spark/Hop/Kettle
- 3+ years of experience in deploying and maintaining data architectures on cloud and container platforms such as AWS, Docker, and Kubernetes
Education
A relevant degree/diploma is required