Job ID: Job_303508
Experience: 3–5 years
Key Responsibilities:
● Design, develop, and maintain reusable data pipelines and accelerator components using PySpark and Databricks.
● Collaborate with architects and senior engineers to translate functional requirements into scalable technical solutions.
● Perform unit and integration testing and support deployments across development and production environments.
● Optimize Spark workloads for performance, scalability, and cost efficiency within the Databricks runtime.
● Implement and manage workflow orchestration using Databricks Workflows and/or Apache Airflow.
● Monitor and troubleshoot data pipelines to ensure reliability and SLA adherence.
● Maintain high code quality through version control, documentation, and peer reviews.
Required Skills & Experience:
● 3–5 years of experience in PySpark development and distributed data processing.
● Strong hands-on experience with Databricks (Notebooks, Jobs, Workflows, Delta Lake).
● Proficiency in writing efficient Spark transformations and Spark SQL queries.
● Experience building modular, reusable, and testable data pipelines.
● Exposure to workflow orchestration using Databricks Workflows or Apache Airflow.
● Familiarity with Git-based version control and basic CI/CD concepts.
● Basic understanding of cloud platforms such as Azure, AWS, or GCP.
● Good communication skills with the ability to document solutions clearly.