What we are looking for:
Technical Skills:
Data Engineering and ETL:
- Basic understanding of data pipelines, ETL processes, and data warehousing concepts
- Proficiency in SQL, with the ability to write and optimize queries (a minimal sketch follows this list)
- Familiarity with big data technologies and distributed computing concepts
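To give a concrete sense of the level expected, here is a deliberately tiny sketch of one extract-transform-load step in Python. The table and column names are invented, and sqlite3 stands in for a real source system and warehouse so the example stays self-contained.

    import sqlite3

    import pandas as pd

    # Extract: pull raw rows from a source system (sqlite3 is used here
    # only to keep the sketch self-contained; any RDBMS works the same way).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, country TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 19.99, "us"), (2, None, "de"), (3, 5.00, "DE")],
    )

    raw = pd.read_sql_query("SELECT id, amount, country FROM orders", conn)

    # Transform: drop rows with no amount and normalize the country code.
    clean = raw.dropna(subset=["amount"]).assign(
        country=lambda d: d["country"].str.upper()
    )

    # Load: write the cleaned rows to a destination table.
    clean.to_sql("orders_clean", conn, if_exists="replace", index=False)

In production the same pattern would typically run inside an orchestrated pipeline rather than a single script.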
Programming and Tools:
- Strong programming skills in Python, including experience with data manipulation libraries such as Pandas and NumPy (see the sketch after this list)
- Basic understanding of cloud environments (GCP, AWS, or Azure); hands-on experience is a plus
- Familiarity with both relational (RDBMS) and non-relational (NoSQL) databases
- Basic knowledge of version control systems (e.g., Git)
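As a rough gauge of the Python fluency we mean, here is a small, self-contained example of everyday Pandas/NumPy work; the dataset and column names are invented for illustration.

    import numpy as np
    import pandas as pd

    # Hypothetical sales data; the columns are illustrative only.
    df = pd.DataFrame({
        "region": ["north", "south", "north", "west"],
        "units": [10, 3, 7, np.nan],
        "price": [2.5, 4.0, 2.5, 3.0],
    })

    # Vectorized column math and a grouped aggregation: the kind of
    # day-to-day manipulation the role involves.
    df["revenue"] = df["units"].fillna(0) * df["price"]
    summary = df.groupby("region", as_index=False)["revenue"].sum()
    print(summary)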
Data Processing and Analysis:
- Experience with data preprocessing, cleaning, and validation techniques (see the sketch after this list)
- Basic understanding of data modeling and dimensional modeling concepts
- Exposure to data quality assessment and data profiling tools
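The preprocessing, cleaning, and validation work above might look like this minimal sketch. The data is fabricated, and the plain assertions stand in for a dedicated data-quality framework.

    import pandas as pd

    # Fabricated raw extract with typical quality problems: a missing key,
    # a duplicate key, and an unparseable date.
    raw = pd.DataFrame({
        "user_id": [1, 2, 2, None],
        "signup_date": ["2024-01-05", "not a date", "2024-02-10", "2024-03-01"],
    })

    clean = (
        raw.dropna(subset=["user_id"])
        .assign(signup_date=lambda d: pd.to_datetime(d["signup_date"], errors="coerce"))
        .dropna(subset=["signup_date"])       # drop unparseable dates
        .drop_duplicates(subset=["user_id"])  # keep one record per user
    )

    # Validation: lightweight checks on the cleaned output.
    assert clean["user_id"].is_unique
    assert clean["signup_date"].notna().all()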
Preferred Technical Skills:
- Exposure to data visualization tools (e.g., Tableau, Power BI, or Metabase)
- Familiarity with dbt (data build tool) or similar data transformation tools
- Basic understanding of containerization (e.g., Docker) and orchestration concepts
- Basic knowledge of data streaming concepts and technologies (e.g., Kafka)
- Familiarity with APIs and web services (see the sketch after this list)
- Exposure to Agile development methodologies
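To illustrate the API familiarity in this list, here is a minimal sketch of pulling records from a web service. The endpoint and parameters are placeholders, and in a streaming setup the records might be published to a Kafka topic rather than printed.

    import json

    import requests

    # Placeholder endpoint; any JSON-returning web API follows the same shape.
    URL = "https://api.example.com/v1/events"

    resp = requests.get(URL, params={"since": "2024-01-01"}, timeout=10)
    resp.raise_for_status()

    for event in resp.json():
        print(json.dumps(event))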
Soft Skills:
- Strong problem-solving skills and attention to detail
- Excellent communication abilities, both written and verbal
- Ability to collaborate effectively in cross-functional teams
- Self-motivated with a strong desire to learn and adapt to new technologies
- Basic project management skills and ability to manage time effectively
- Analytical thinking and ability to translate business requirements into technical solutions
- Passion for staying up to date with the latest trends in data engineering and cloud computing
What you will do:
- Assist in building and maintaining data pipelines and data warehouses, with a focus on developing ETL/ELT processes using both traditional methods and modern tools such as dbt.
- Collaborate with data scientists, analysts, and other team members to support their data needs, translating business requirements into technical solutions.
- Participate in implementing automated documentation processes, potentially utilizing AI tools to generate and maintain up-to-date documentation for data pipelines and models.
- Assist in optimizing SQL queries, database performance, and data architecture, providing recommendations for improvements in scalability and efficiency.
- Help implement data governance practices, including data quality checks, security measures, and metadata management, to ensure compliance and improve data discovery (a minimal check is sketched after this list).
- Contribute to exploring and implementing new data technologies, participate in code reviews, and support the team's continuous learning and improvement.
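As a flavor of the data-governance responsibilities above, here is a minimal sketch of a post-load quality check. The table name and the rule are hypothetical, and sqlite3 is used only to keep the example self-contained; the seeded NULL makes the check fail on purpose.

    import sqlite3

    conn = sqlite3.connect(":memory:")  # stands in for the warehouse
    conn.execute("CREATE TABLE orders_clean (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders_clean VALUES (?, ?)", [(1, 9.5), (2, None)])

    # One simple completeness rule; real pipelines run many such checks
    # and surface failures through alerting rather than an exception.
    null_count = conn.execute(
        "SELECT COUNT(*) FROM orders_clean WHERE amount IS NULL"
    ).fetchone()[0]

    if null_count > 0:
        raise ValueError(f"{null_count} NULL amounts found in orders_clean")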