Data Engineer - Niche System Integrator, Amsterdam

Job description

Indigo is recruiting a Data Engineer for a niche system integrator in Amsterdam. You will be part of a company with 270 employees, 250 of whom are DevOps and software engineers. As the numbers indicate, the company is run by engineers. The company is divided into customer teams that build, refactor and maintain applications for a range of companies, varying from financial scale-ups (fintech) to large online retailers and corporates in the energy sector.


As a Data Engineer you are part of a multi-disciplinary team that actively explores, designs, builds and runs data lakes in one of the public clouds. The teams are completely self-steering: they have no managers, and every team member is in direct contact with the customer. Colleagues who work on different projects or with different customers will come to you and your team for knowledge and expertise.


The company keeps its customers moving forward by designing, building and running their data landscapes. These landscapes are often developed in public cloud environments, but implementations vary: no two customers are the same. The company runs data lake environments on its own MCC (Mission Critical Cloud), on AWS, on Azure and more.


The current customer case in which you will be onboarded is a great one. You will work with one of the Netherlands' largest beer breweries to design, implement and optimize an IoT/data solution. The PLCs that run the factory equipment are connected to an IoT network (using Raspberry Pis). Together with your data engineering team, you will be responsible for building the data pipelines, ETL processes and data lakes (using AWS Greengrass).
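To give a flavour of the work (this is an illustrative sketch, not the actual project code), an edge-side step in such a pipeline might flatten raw PLC sensor readings into records ready for ingestion into a data lake. The names `PlcReading` and `to_record` are hypothetical; in the real setup a similar transform could run as a Greengrass component on a Raspberry Pi before data is shipped to AWS:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class PlcReading:
    """Raw reading as it might arrive from a PLC over the IoT network."""
    machine_id: str
    sensor: str
    value: float
    epoch_ms: int  # timestamp assigned by the edge device

def to_record(reading: PlcReading) -> dict:
    """Flatten a reading into a dict suitable for batching into a data lake."""
    ts = datetime.fromtimestamp(reading.epoch_ms / 1000, tz=timezone.utc)
    return {
        "machine_id": reading.machine_id,
        "sensor": reading.sensor,
        "value": reading.value,
        "event_time": ts.isoformat(),
        "ingest_date": ts.date().isoformat(),  # convenient partition key
    }

record = to_record(PlcReading("brewhouse-01", "temperature_c", 64.2, 1_700_000_000_000))
```

Keeping a date-based partition key alongside the event timestamp is a common choice when the target is a partitioned data lake layout on S3.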


You will be taking on various cases, as well as finding answers to questions like these:

  • What data quality rules are there and how should they be implemented? 
  • Which storage solution(s) is/are required? 
  • How and with what tooling will we do ETL? 
  • What ingestion types are required? 
  • How will governance of the entire landscape be done? 
  • What is required to train my ML model effectively? 
  • … and many more


Requirements

What kind of experience do we look for?

  • Developing and maintaining data pipelines. 
  • Comfortable with DevOps-driven teams. 
  • Configuring and managing tools like Spark, Glue, Kinesis, S3, Lambda, Kafka, Redshift, Athena, Event Hubs, etc. 
  • Data lake experience. 
  • Great (but not required) if you have AWS Greengrass experience (https://aws.amazon.com/greengrass/). 
  • Supporting tooling such as GitLab (CI/CD), Ansible (automation), Terraform (infrastructure). 
  • Affinity with scripting languages, preferably Python. 
  • Ability to self-organize; you are in control of your own direction, as there is no management. 
  • Capable of communicating proficiently with customers who have varying levels of expertise.