Our client is seeking an experienced Principal Data Engineer for a 6-month contract to play a pivotal role in building a new, centralized data model. This is a unique opportunity to shape the organization's data landscape, working closely with a consultant to define the strategy and roadmap for syncing and centralizing data from legacy systems. You will be the technical authority on the project, responsible for establishing design patterns, identifying technical gaps and risks, and ultimately leading hands-on development of the data model.
Key Responsibilities
- Design, build, and optimize large-scale Azure-based data pipelines and architectures.
- Develop and implement scalable ETL processes that consolidate data from multiple sources, with a focus on integrating legacy systems using Change Data Capture (CDC) techniques.
- Design and maintain data warehouse solutions using Azure Synapse Analytics, Snowflake, or BigQuery.
- Lead the development and maintenance of a centralized data model that integrates legacy systems with modern platforms, using CDC for real-time data synchronization.
- Implement data modeling best practices to support analytics and reporting needs.
- Lead and mentor a team of data engineers, providing technical guidance and best practices while ensuring hands-on involvement in building key components of the data model.
- Collaborate with data scientists and analysts to deliver high-quality datasets for machine learning and business intelligence.
- Ensure seamless data integration across multiple platforms, optimizing data movement for scalability.
- Ensure data quality, governance, and security across all data platforms.
- Optimize database performance and implement data storage solutions tailored for scalability.
- Drive automation and CI/CD practices in data pipeline development.
- Work with stakeholders to define data strategies and align them with business objectives.
- Ensure compliance with data privacy regulations such as GDPR, HIPAA, or CCPA.
Required Qualifications & Experience
- 10+ years of experience in data engineering, data architecture, or related fields.
- Strong proficiency in programming languages such as Python, Scala, or Java.
- Expertise in SQL and database technologies such as PostgreSQL, MySQL, or Azure SQL Database.
- Hands-on experience with Azure data services (Azure Data Factory, Azure Synapse Analytics, Azure Data Lake).
- Extensive experience with data warehousing, data modeling, and data integration, including integrating legacy systems into modern data architectures and implementing CDC.
- Strong knowledge of ETL and orchestration tooling (Apache Airflow, Talend, Informatica, dbt) and experience applying CI/CD to data pipelines.
- Familiarity with Azure DevOps, Kubernetes, Docker, and microservices for data engineering.
- Proven ability to optimize and troubleshoot large-scale data processing systems.
- Excellent leadership, communication, and problem-solving skills.
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
Preferred Qualifications
- Experience working with real-time streaming data solutions (Azure Event Hubs, Apache Kafka, Flink, Spark Streaming).
- Hands-on experience with machine learning data pipelines.
- Industry certifications (Azure Data Engineer Associate, AWS Certified Data Analytics, Google Cloud Professional Data Engineer, etc.).