Big Data Engineer (The Data Pipeline Innovator)


Job Description

Are you passionate about handling massive datasets and building the infrastructure that enables complex data analysis and machine learning at scale?

Do you excel in creating robust, scalable data pipelines that fuel data-driven decision-making?

If you’re ready to tackle the challenges of big data, our client has the perfect role for you.

We’re seeking a Big Data Engineer (aka The Data Pipeline Innovator) to architect and maintain high-performance data systems that empower analytics and support advanced data processing needs.

As a Big Data Engineer at our client, you’ll collaborate with data scientists, analysts, and software engineers to design, implement, and optimize big data platforms.

Your expertise in data engineering, distributed systems, and cloud infrastructure will be critical to ensuring that our data ecosystem is efficient, reliable, and scalable.

Key Responsibilities:

Design and Build Scalable Data Pipelines: Architect and implement data pipelines for ETL processes using tools like Apache Spark, Kafka, and Hadoop.

You’ll create data workflows that handle high-volume, high-velocity data and ensure seamless integration across systems.
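To make this concrete, here is a minimal sketch of the kind of ETL pipeline this responsibility describes: a PySpark Structured Streaming job that reads JSON events from Kafka and lands them as Parquet. The broker address, topic name, event schema, and S3 paths are illustrative assumptions, not details of the client’s stack (the Kafka source also requires the spark-sql-kafka connector package on the classpath).

```python
# Minimal PySpark sketch: read a Kafka stream, parse JSON, write to Parquet.
# Topic name, schema, and paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("event-ingest").getOrCreate()

# Hypothetical schema for the incoming JSON payloads.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("ts", LongType()),  # epoch milliseconds
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Land the parsed stream as Parquet; checkpointing makes the job restartable.
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://data-lake/raw/events/")
         .option("checkpointLocation", "s3a://data-lake/checkpoints/events/")
         .start())
```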

Optimize Big Data Storage and Processing: Develop and manage data storage solutions (e.g., HDFS, S3, Cassandra) that are optimized for performance and cost-efficiency.

You’ll configure distributed processing systems to support efficient data retrieval and transformation.
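As a rough illustration of the storage-layout side, the sketch below rewrites raw events as date-partitioned, Snappy-compressed Parquet on S3, a common pattern that lets downstream queries prune partitions instead of scanning the full dataset. Bucket paths and column names are hypothetical, carried over from the ingestion sketch above.

```python
# Sketch: write date-partitioned, compressed Parquet to S3 so downstream
# queries can prune partitions rather than scan everything.
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = SparkSession.builder.appName("storage-layout").getOrCreate()

df = spark.read.parquet("s3a://data-lake/raw/events/")

(df.withColumn("event_date",
               to_date((col("ts") / 1000).cast("timestamp")))  # epoch ms -> date
   .repartition("event_date")   # group rows by partition value to avoid small files
   .write
   .mode("overwrite")
   .option("compression", "snappy")
   .partitionBy("event_date")
   .parquet("s3a://data-lake/curated/events/"))
```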

Collaborate on Data Strategy and Integration: Work closely with data scientists, analysts, and other engineers to align big data architecture with analytics goals.

You’ll ensure data availability and integrity across systems to support business objectives.

Implement Data Quality and Governance Standards: Develop processes and tools to monitor data quality and enforce data governance policies.

You’ll ensure data is accurate, reliable, and secure through regular checks and validation processes.
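A data-quality gate can be as simple as a few batch-level assertions run before data is published downstream. The sketch below checks null rates and duplicates in PySpark; the thresholds and column names are made-up examples, not the client’s governance rules.

```python
# Minimal data-quality sketch: fail fast if a batch violates basic rules.
# Thresholds and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("s3a://data-lake/curated/events/")

total = df.count()
null_ids = df.filter(col("user_id").isNull()).count()
duplicates = total - df.dropDuplicates(["user_id", "ts"]).count()

# Enforce simple governance rules before the data is published downstream.
assert total > 0, "empty batch"
assert null_ids / total < 0.01, f"too many null user_ids: {null_ids}"
assert duplicates == 0, f"duplicate events detected: {duplicates}"
```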

Enhance Data Processing with Automation: Use tools like Apache Airflow or AWS Glue to automate data workflows and reduce manual processing.

You’ll implement scripts and automation that streamline data handling and improve efficiency.
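For a sense of what that automation looks like, here is a hedged sketch of a daily Airflow DAG that chains ingestion and validation (Airflow 2.x syntax; the DAG id, schedule, and callables are illustrative placeholders).

```python
# Sketch of a daily Airflow DAG chaining ingestion and validation.
# DAG id, schedule, and callables are illustrative.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest_events(**_):
    ...  # e.g., submit the Spark ingestion job

def validate_events(**_):
    ...  # e.g., run the data-quality checks

with DAG(
    dag_id="events_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=ingest_events)
    validate = PythonOperator(task_id="validate", python_callable=validate_events)
    ingest >> validate  # validation runs only after ingestion succeeds
```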

Monitor and Troubleshoot Data Systems: Use monitoring tools to track system performance and address issues proactively.

You’ll troubleshoot and resolve any bottlenecks or failures to maintain optimal data processing capabilities.
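One lightweight monitoring pattern, assuming the running `query` from the ingestion sketch above, is to poll the stream’s progress metrics and flag low throughput before it becomes a backlog; the threshold here is purely illustrative.

```python
# Sketch: poll a Structured Streaming query's progress to spot lag early.
# Assumes `query` is the running stream from the ingestion sketch above.
import time

while query.isActive:
    progress = query.lastProgress  # metrics from the most recent micro-batch
    if progress:
        rate = progress.get("processedRowsPerSecond", 0)
        if rate < 100:  # illustrative threshold, tune per workload
            print(f"WARN low throughput: {rate:.0f} rows/s")
    time.sleep(60)
```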

Stay Updated on Big Data Trends and Technologies: Keep up with advancements in big data technologies and tools.

You’ll integrate new techniques and platforms that align with business needs and promote innovation.