Data Engineer R&D (hybrid) - Nicosia

  • Location:

    Cyprus, Nicosia

  • Discipline:

    IT

  • Job type:

    Permanent

  • Benefits:

    Salary based on skills and experience, 13th salary, provident fund, and medical insurance after 6 months

  • Published:

    02-04-2024

  • Expiry date:

    02-07-2024

  • Reference:

    2343

Our client, a Cybersecurity Company in Nicosia, is looking to hire an experienced Data Engineer who has worked with large language models (LLMs) to join the Research and Development team. This role is crucial for developing and maintaining scalable data pipelines and infrastructure to support the training and deployment of large language models. The ideal candidate will combine data engineering skills with a deep understanding of the intricacies of managing data for LLMs and other advanced modelling, from preprocessing to optimization for performance at scale.

Responsibilities:

  • Design, build, and maintain scalable and efficient data pipelines specifically tailored for training and deploying large language models
  • Work closely with data scientists and machine learning engineers to understand data requirements for LLM projects, including data collection, processing, and storage needs
  • Implement and manage data ingestion routines from a variety of sources, ensuring data quality and accessibility for LLM training
  • Optimize data infrastructure to support the computational demands of LLMs, including performance tuning and scalability improvements
  • Develop tools and processes for monitoring and analyzing data pipeline performance and data quality, ensuring the integrity and availability of data
  • Collaborate with cross-functional teams to ensure seamless integration of LLMs into production environments, including support for model versioning, deployment, and monitoring
  • Stay abreast of the latest developments in large language models, data engineering practices, and technologies to continually improve pipeline efficiency and model performance
  • Ensure compliance with data governance and security policies throughout the data lifecycle, from ingestion to model deployment

Requirements:

  • At least 2 years of proven experience as a Data Engineer, with specific experience working on projects involving large language models
  • Strong expertise in data modelling, ETL processes, and data pipeline tools
  • Proficiency in programming languages commonly used in data engineering and machine learning, such as Python and SQL
  • Experience with big data technologies (e.g., Hadoop, Spark) and cloud services (AWS, Google Cloud, Azure) tailored for machine learning and data processing workloads
  • Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes) for deploying and managing LLM applications
  • Familiarity with machine learning operations (MLOps) practices for managing the lifecycle of machine learning models, including large language models
  • Excellent problem-solving skills, with the ability to work independently and as part of a team in a fast-paced environment
  • Strong communication skills, with the ability to explain complex technical concepts to non-technical stakeholders
  • Fluency in Greek and English

Working hours:

  • Working hours are 9am-6pm (20 min break), with Friday afternoons off (hybrid working)

To apply:

Please send your CV to StaffMatters at admin@smstaffmatters.com and mention that you are applying for the vacancy of Data Engineer R&D (hybrid) with reference number 2343.
Alternatively, you can apply directly through your candidate login by clicking the APPLY button.