Data Engineer Position
The purpose of this role is to provide data analysis and reporting tools for the regulatory reporting team and similar stakeholders (Compliance, top management, IG, Business), on both an ad hoc and a regular basis, using a variety of tools and programming skills.
Main Responsibilities:
• Responsible for maintaining the infrastructure that supports the current data architecture
• Responsible for creating data pipelines in Airflow for data extraction, processing and loading
• Responsible for data pipeline maintenance, monitoring and stability
• Responsible for providing data access to data analysts and end-users
• Responsible for DevOps infrastructure
• Responsible for deploying Airflow DAGs to the production environment using DevOps tools
• Responsible for code and query optimization
• Responsible for code review
• Responsible for improving the current data architecture and DevOps processes
• Responsible for delivering data in useful and appealing ways to users
• Responsible for performing and documenting analyses, reviews and studies on specified regulatory topics
• Responsible for understanding business changes and requirements, and assessing their impact and cost
Technical skills:
• Advanced Python (Mandatory)
• Experience in creating APIs in Python, at least with Flask (Mandatory)
• Experience in documenting and testing in Python (Mandatory)
• Advanced SQL skills and relational database management (Oracle is Mandatory, SQL Server is desirable, PostgreSQL is desirable)
• Experience with Data Warehouses
• Hadoop ecosystem - HDFS + YARN
• Spark Environment Architecture (Mandatory)
• Advanced PySpark (Mandatory)
• Experience in creating and maintaining distributed environments using Hadoop and Spark
• Data Lakes - Experience in organizing and maintaining data lakes
• Experience with Parquet file format (Avro is a plus)
• Apache Airflow - Experience in both pipeline development and deploying Airflow in a distributed environment (Mandatory)
• Containerization - Docker
• Kubernetes
• Apache Kafka
• Experience in automating application deployment using DevOps tools (Jenkins is Mandatory, Ansible is a plus)
• Agile methodologies (at least Scrum)
• Fluent in English (Mandatory)