Junior Software Engineer (Web Scraping)

Job description

Daltix is a fast-growing, successful, data-driven start-up from Belgium with offices in Boom, Ghent and Lisbon. We are bringing real-time insights to the world of retail. How are we doing that? We have developed a set of tools to gather, process and analyze retail data, and we bring it all together in the data platform we offer to our clients. So far, we have had great success in doing so. That’s why today Daltix serves customers such as Makro, Lidl, Dreamland, Jumbo, Greenyard, Unilever and others.


Sign me up! No wait... What will I be doing?

We are looking for a talented Junior Software Engineer (Web Scraping) who is excited to help build and maintain the distributed data collection system (the heart of our business, literally) in our office in Lisbon. As such, you would have the opportunity to be on the front lines, facing massive (but interesting!) challenges as we try to scrape all retail data available. We are a data-driven company which collects and processes more than 600GB of raw data (HTML) daily. We leverage big data technologies such as Serverless and Spark on AWS EMR to crunch these volumes of data and make them queryable.

 

In this role you will ensure that our data collection engine, which consists of distributed web crawlers, is state of the art and ahead of our competition. You will ensure that we can scrape any webshop, no matter what ban detection has been put in place. It will also be important that proper monitoring tools are in place. We are currently scraping 60 sites and your goal is to at least triple that without losing completeness and quality.


Your typical tasks:

  • Developing & maintaining web crawlers using Python & JavaScript.
  • Designing & developing internal tools and frameworks.
  • Designing & developing tools to automate testing & QA.
  • Building & maintaining distributed data collection systems on AWS.
  • Guaranteeing data integrity and quality by extending our logging, monitoring and outlier detection systems.

About the Stack:

  • Our distributed system is built on top of Amazon Web Services and uses Serverless architectures where possible, with Python & JavaScript being the main programming languages used.
  • As Daltix scales from 50+ websites to 200+ websites (which it scrapes multiple times per day!), it has to invest in orchestration technologies such as Kubernetes as well as logging & monitoring solutions to keep an overview at scale.

Requirements

Why you're the one we are dying to meet:

  • You have broad knowledge of computer engineering (e.g. through a CS master's, home projects, ...).
  • You have strong programming experience with one or more languages: Python (preferred), JavaScript, C#, Java, Scala, C/C++, ...
  • You have a good understanding of REST APIs, databases and SQL
  • You have experience with the Linux command line
  • Knowledge of the AWS stack is nice to have
  • You are fluent in spoken and written English
  • You have excellent problem-solving capabilities and a critical mindset
  • You are passionate about software engineering
  • You get energy from working in a highly complex and challenging startup environment with a high-tech product
  • You are based in Lisbon (Portugal) or willing to relocate

If you still need more convincing…

  • We are a young, entrepreneurial and fast-growing company; you will have the unique opportunity to shape our future and have a positive impact on our clients’ business

  • You will be offered a competitive wage in a talented, international team of top-notch experts

  • Flexible work arrangements, including home office, with a lot of autonomy in what you do and where you do it. We trust you to know your schedule and work when you feel most productive

  • You will be able to participate in relevant training to stay at the top of this field. Be part of an interesting and dynamic start-up and enjoy the scaling process.

  • An open company culture where we play as hard as we work

  • A cosy office in the heart of Lisbon

  • Health insurance coverage

  • Fresh fruit, snacks, tea and coffee on the house

  • Occasional drinks and team events

  • You’ll get the chance to meet and work with industry professionals and help lift the company to the next level

Join us on the path to becoming the Google of retail and change the way retailers and suppliers work.