What we are looking for:

Technical Skills:

Data Engineering and ETL:

    • Basic understanding of data pipelines, ETL processes, and data warehousing concepts (see the sketch following this list)
    • Proficiency in SQL, with the ability to write and optimize queries
    • Familiarity with big data technologies and distributed computing concepts
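
A minimal, hypothetical sketch of the extract-transform-load pattern referenced above: read raw rows from a CSV, transform them with Pandas, and load the result into a warehouse-style table. The file, table, and column names are placeholders, not part of any real system.

```python
import sqlite3

import pandas as pd

# Extract: read the raw file (hypothetical orders.csv with order_id, amount, created_at)
raw = pd.read_csv("orders.csv", parse_dates=["created_at"])

# Transform: drop incomplete rows and derive a reporting column
clean = raw.dropna(subset=["order_id", "amount"])
clean = clean.assign(order_month=clean["created_at"].dt.to_period("M").astype(str))

# Load: write the result to a warehouse-style fact table
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("fact_orders", conn, if_exists="replace", index=False)
```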

Programming and Tools:

    • Strong programming skills in Python, including experience with data manipulation libraries (e.g., Pandas, NumPy); a short example follows this list
    • Basic understanding of cloud environments (GCP, AWS, or Azure); hands-on experience is a plus
    • Familiarity with both relational (RDBMS) and non-relational (NoSQL) databases
    • Basic knowledge of version control systems (e.g., Git)
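
A short example of the day-to-day Pandas/NumPy data manipulation the first bullet refers to; the data is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "revenue": [120.0, np.nan, 90.0, 150.0],
})

# Fill the missing value, then aggregate per region
df["revenue"] = df["revenue"].fillna(df["revenue"].mean())
summary = df.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary)
```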

Data Processing and Analysis:

    • Experience with data preprocessing, cleaning, and validation techniques (illustrated after this list)
    • Basic understanding of data modeling and dimensional modeling concepts
    • Exposure to data quality assessment and data profiling tools
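
To make the preprocessing and profiling bullets concrete, here is a hedged sketch of a minimal profile-and-validate pass; the column names and rules are hypothetical.

```python
import pandas as pd

def profile(df: pd.DataFrame) -> None:
    """Print a minimal profile: row count plus dtype and null rate per column."""
    print(f"rows: {len(df)}")
    for col in df.columns:
        print(f"{col}: dtype={df[col].dtype}, null_rate={df[col].isna().mean():.1%}")

def validate(df: pd.DataFrame) -> list[str]:
    """Return human-readable validation failures (an empty list means clean)."""
    failures = []
    if df["order_id"].duplicated().any():   # uniqueness check (hypothetical column)
        failures.append("order_id contains duplicates")
    if (df["amount"] < 0).any():            # range check (hypothetical column)
        failures.append("amount contains negative values")
    return failures
```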

Preferred Technical Skills:

    • Exposure to data visualization tools (e.g., Tableau, Power BI, or Metabase)
    • Familiarity with dbt (data build tool) or similar data transformation tools
    • Basic understanding of containerization (e.g., Docker) and orchestration concepts
    • Basic knowledge of data streaming concepts and technologies (e.g., Kafka)
    • Familiarity with APIs and web services (see the example after this list)
    • Exposure to Agile development methodologies
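
As a concrete example of working with web APIs, a minimal sketch of pulling JSON from an HTTP endpoint into a DataFrame; the URL is a placeholder, and real endpoints typically add authentication and pagination.

```python
import pandas as pd
import requests

response = requests.get("https://api.example.com/v1/orders", timeout=30)
response.raise_for_status()        # fail fast on HTTP errors
records = response.json()          # assumes the endpoint returns a JSON array
df = pd.json_normalize(records)    # flatten nested fields into columns
print(df.head())
```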

Soft Skills:

    • Strong problem-solving skills and attention to detail
    • Excellent written and verbal communication skills
    • Ability to collaborate effectively in cross-functional teams
    • Self-motivated, with a strong desire to learn and adapt to new technologies
    • Basic project management and time management skills
    • Analytical thinking and the ability to translate business requirements into technical solutions
    • Passion for staying current with the latest trends in data engineering and cloud computing

What you will do:

    • Assist in building and maintaining data pipelines and warehouses, with a focus on developing ETL/ELT processes using both traditional methods and modern tools such as dbt (data build tool).
    • Collaborate with data scientists, analysts, and other team members to support their data needs, translating business requirements into technical solutions.
    • Participate in implementing automated documentation processes, potentially using AI tools to generate and maintain up-to-date documentation for data pipelines and models.
    • Assist in optimizing SQL queries, database performance, and data architecture, and recommend improvements in scalability and efficiency.
    • Help implement data governance practices, including data quality checks, security measures, and metadata management, to ensure compliance and improve data discovery (see the sketch after this list).
    • Contribute to the exploration and implementation of new data technologies, participate in code reviews, and support the team's continuous learning and improvement.
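
To give a flavor of the data quality work above, a self-contained sketch in the style dbt tests use: each check is a SQL query that passes when it returns no rows. The schema and data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO fact_orders VALUES (?, ?)",
    [(1, 120.0), (2, -5.0), (2, 90.0)],  # seeded with a duplicate id and a negative amount
)

# A check passes when its query returns zero rows
checks = {
    "unique order_id": """
        SELECT order_id FROM fact_orders
        GROUP BY order_id HAVING COUNT(*) > 1
    """,
    "non-negative amount": "SELECT order_id FROM fact_orders WHERE amount < 0",
}

for name, sql in checks.items():
    bad_rows = conn.execute(sql).fetchall()
    print(f"{name}: {'FAIL' if bad_rows else 'PASS'} ({len(bad_rows)} offending rows)")
conn.close()
```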