Lead Big Data Engineer
Full Time
Islamabad
Job Overview:
As a Lead Big Data Engineer specializing in Java development, you will play a crucial role in designing, developing, and implementing large-scale data processing systems. Your primary focus will be on leveraging big data technologies to build robust, efficient solutions for managing and analyzing vast amounts of data. You will collaborate with cross-functional teams, including business analysts, customers, data scientists, data analysts, and software engineers, to deliver scalable, high-performance data processing pipelines.
Required Skills and Responsibilities:
- Designing and developing Java-based applications and frameworks for big data processing, storage, and analytics.
- Building and maintaining large-scale data processing pipelines using distributed computing frameworks like Apache Hadoop or Apache Spark.
- Implementing data ingestion processes to acquire, process, and store structured and unstructured data from various sources.
- Collaborating with data scientists and data analysts to understand their requirements and translate them into scalable and efficient data processing workflows.
- Optimizing and tuning the performance of big data applications using parallel processing, distributed caching, and other techniques.
- Developing and maintaining data models, schemas, and data dictionaries for efficient data storage and retrieval.
- Ensuring data quality, data integrity, and data security throughout the data processing lifecycle.
- Monitoring and troubleshooting data processing pipelines, identifying and resolving issues to ensure reliable and efficient operation.
- Keeping up to date with the latest advancements in big data technologies and evaluating their applicability to enhance existing systems or introduce new capabilities.
- Collaborating with the DevOps team to deploy, configure, and manage big data infrastructure and tools in a cloud or on-premises environment.
Qualifications and Experience:
- Strong proficiency in the Java programming language and its ecosystem, including frameworks such as Spring or Java EE.
- Experience with distributed computing frameworks and big data technologies such as Apache Hadoop or Apache Spark.
- Familiarity with data processing and storage technologies like Apache Kafka, Apache Hive, Apache HBase, or NoSQL databases.
- Solid understanding of distributed computing principles and concepts, including parallel processing, fault tolerance, and scalability.
- Knowledge of data modeling techniques and experience with designing and optimizing data schemas for efficient data storage and retrieval.
- Proficiency in SQL and experience working with relational databases like MySQL, PostgreSQL, or Oracle.
- Strong problem-solving skills and the ability to analyze complex data-related issues and provide effective solutions.
- Familiarity with cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP), and with visualization tools such as Tableau or Power BI, is a plus.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in big data engineering or a similar role.