At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.
Job Description: Data Engineer
Role Overview:
We are seeking a highly skilled and experienced Data Engineer with expertise in Python data engineering, PostgreSQL, and Apache Airflow. The ideal candidate will have 5-8 years of experience in data engineering and will play a crucial role in designing, implementing, and maintaining data pipelines and their orchestration. You will work closely with data scientists, analysts, and other stakeholders to ensure efficient and reliable data processes.
Responsibilities:
- Design, develop, and maintain scalable data pipelines using Python, PostgreSQL, and Apache Airflow.
- Collaborate with data scientists and analysts to understand data requirements and implement solutions.
- Optimize and tune database queries for performance and scalability.
- Integrate Kafka into data pipelines for asynchronous data processing.
- Ensure data quality and integrity through robust validation and monitoring processes.
- Develop and implement a data pipeline framework to integrate data from various sources.
- Monitor and troubleshoot data pipelines, ensuring timely resolution of issues.
- Implement best practices for data engineering, including code versioning, testing, and documentation.
- Work with on-premises CI/CD platforms to deploy and manage data pipelines.
- Collaborate with software engineering teams to integrate data solutions into applications.
- Stay updated on the latest trends and technologies in data engineering and PostgreSQL, including improving vector database performance.
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5-8 years of experience in data engineering.
- Strong proficiency in Python programming for data engineering tasks.
- In-depth knowledge of PostgreSQL and experience in database design, optimization, and management.
- Hands-on experience with Apache Airflow for orchestrating data workflows.
- Experience with data pipeline processes and tools.
- Experience with Kafka for data processing.
- Familiarity with DevOps and CI/CD practices, including Docker containerization.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration skills.
- Understanding of data privacy, security, and compliance considerations.
Good to Have Skills:
- Experience with other storage technologies (e.g., Dell's S3-compatible object storage).
- Experience with DevOps practices and tools (e.g., Docker, Kubernetes, Terraform).
- Understanding of MLOps and integrating data engineering with machine learning workflows.
- Ability to work in an Agile development environment.
EY | Building a better working world
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.
Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate.
Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.