Blue Ridge Global

Senior Data Engineer

Remote · Full-time · Senior · 🇬🇪 Georgia · Development

As we transition to a modern big data infrastructure, PySpark plays a critical role in powering high-performance data processing. We are seeking a Data Engineer / PySpark expert to optimize data pipelines, enhance processing efficiency, and drive cost-effective cloud operations. This role will have a direct impact on scalability, performance, and real-time data processing, ensuring the company remains competitive in data-driven markets.

You’ll be working closely with a Data Platform Architect and a newly formed team of four Data Engineers based in India (GMT+5:30) and one Data Engineer in Uzbekistan (GMT+5). Additionally, we're planning to hire two more Senior Data Engineers in Georgia later this year. In this role, you’ll report to the CTO, who is based in the GMT-8 time zone, and to the VP of Engineering (EDT/EST).

Position Details

  • Role: Senior Data Engineer
  • Location: Remote (We’re looking for candidates based in Georgia, Romania, and the Czech Republic only)
  • Employment: Service Agreement (B2B contract; you’ll need a legal entity to sign)
  • Start Date: ASAP
  • Salary: $5,500 - $8,000 USD per month GROSS (fixed income, paid via SWIFT)
  • Working Hours: 11 AM to 7 PM local time; no night or weekend work is expected.
  • Time Overlaps: Sync-ups with R&D (Pune, India) in GMT+5:30 and developers in GMT-5, plus occasional meetings with the VP of Engineering (EST/EDT) and the CTO (GMT-8).
  • Equipment: The company will provide a laptop.

What You’ll Be Doing

  • Optimize Data Processing Pipelines: Fine-tune PySpark jobs for maximum performance, scalability, and cost efficiency, enabling smooth real-time and batch data processing.
  • Modernize Legacy Systems: Drive the migration from traditional .NET, C#, and relational database systems to a modern big data tech stack.
  • Build Scalable ETL Pipelines: Design and maintain robust ETL/ELT workflows capable of handling large volumes of data within our Bronze/Silver/Gold data lake architecture.
  • Enhance Apache Spark Workloads: Apply best practices such as memory tuning, efficient partitioning, and caching to optimize Spark jobs.
  • Leverage Cloud Platforms: Use AWS EMR, Databricks, and other cloud services to support scalable, low-maintenance, high-performance analytics environments.
  • Balance Cost & Performance: Continuously monitor resource usage, optimize Spark cluster configurations, and manage cloud spend without compromising availability.
  • Support Real-Time Data Streaming: Contribute to event-driven architectures by developing and maintaining real-time streaming data pipelines.
  • Collaborate Across Teams: Partner closely with data scientists, ML engineers, integration specialists, and developers to prepare and optimize data assets.
  • Enforce Best Practices: Implement strong data governance, security, and compliance policies to ensure data integrity and protection.
  • Drive Innovation: Participate in global initiatives to advance supply chain technology and real-time decision-making capabilities.
  • Mentor Junior Engineers: Share your knowledge of PySpark, distributed systems, and scalable architectures to help develop the team’s capabilities.
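For a flavor of the tuning work described above, a Spark job submission with explicit memory, partitioning, and serialization settings might look like the following sketch (the job name, resource sizes, and partition count are illustrative placeholders, not values from this posting):

```shell
# Hypothetical spark-submit invocation showing common tuning levers:
# executor sizing, shuffle partition count, adaptive query execution,
# and Kryo serialization for faster object (de)serialization.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 8g \
  --executor-cores 4 \
  --num-executors 20 \
  --conf spark.sql.shuffle.partitions=400 \
  --conf spark.sql.adaptive.enabled=true \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  etl_job.py
```

In practice these values are chosen per workload by profiling the Spark UI (shuffle spill, task skew, GC time) rather than fixed up front.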

Experience & Expertise:

  • 5+ years as a Data Engineer, with solid experience in big data ecosystems.
  • 7+ years of hands-on AWS experience is a must, including deep familiarity with EMR, IAM, VPC, EKS, ALB, and Lambda.
  • Cloud experience beyond AWS (GCP or Azure) is a strong plus.
  • Proficiency with Python (including data structures and algorithms), SQL, and data modeling.
  • Strong expertise in distributed computing frameworks, particularly Apache Spark and Airflow.
  • Experience with streaming technologies such as Kafka.
  • Proven track record optimizing Spark jobs for scalability, reliability, and performance.
  • Familiarity with cloud-native ETL/ELT workflows, data sharing techniques, and query optimization (e.g., AWS Athena, Glue, Databricks).
  • Experience with complex business logic implementation and enabling application engineers through APIs and abstractions.
  • Solid understanding of data modeling, warehousing, and schema design.

Soft Skills:

  • Strong problem-solving skills and proactive communication.
  • Fluent English, B2 or higher (both written and verbal).

Preferred Skills & Certifications:

  • Familiarity with .NET applications structure and deployment.
  • Relevant cloud certifications (AWS Solutions Architect, Developer, Big Data Specialty).
  • Certifications or proven experience in Databricks, Apache Spark, Apache Airflow, and data modeling are a plus.

Recruitment Process

  • # 1 Initial Interview: Up to 1 hour with HR and/or a self-assessment form. If you prefer, you can skip the call and discuss all questions and details in writing instead. Just let us know!
  • # 2 Managerial Interview (Optional): 30-60 minutes with the CTO to learn more about the company, the position, and future plans directly from the source.
  • # 3 Test Assignment: Up to 113 minutes on the iMocha platform (graph data structures plus array and string manipulation, all in Python, with a few multiple-choice questions on Spark).
  • # 4 Technical Interview: Up to 1 hour with a Platform/Application Architect.
  • # 5 Offer & Paperwork: Up to 30 minutes with the CTO to finalize conditions and complete necessary paperwork.
  • # 6 Onboarding: Get ready to join the team and start your journey!

Ready to apply for this role?

Apply Now →
