Senior Data Engineer, Data Engineering & Bioinformatics

Up to US$160,000 per annum + Highly Competitive Salary
  1. Permanent
  2. Manufacturing, Project Engineering, Operations Management
  3. United States
Princeton, USA
Posting date: 02 Nov 2023

Proclinical Staffing is partnering with a global biotechnology organization that is looking for a data engineer with expertise in machine learning algorithms. The successful candidate will take on projects involving OCR, LLM, and NLP components.

Must be eligible to work in the US.

Job Responsibilities:

  • Apply knowledge of machine learning, OCR, and intelligent document processing to develop and implement advanced data processing systems
  • Apply expertise in software development to design and build efficient data pipelines and solutions that improve current business processes and reduce time to value
  • Collaborate with cross-functional teams to understand business requirements and develop AI/ML applications that support digital transformation initiatives
  • Optimize data workflows and implement scalable solutions to accelerate AI adoption and streamline data processing
  • Design, implement and manage ETL data pipelines that ingest vast amounts of commercial and scientific data from public, internal and partner sources into various repositories on a cloud platform (AWS)
  • Enhance end-to-end workflows with automation that rapidly accelerates data flow, using pipeline management tools such as Step Functions
  • Manage relationships and project coordination with external parties such as Contract Research Organizations (CRO) and vendor consultants/contractors

Skills and Requirements:

  • BS/MS in Computer Science, Bioinformatics, or a related field with 5+ years of software engineering experience, or a PhD in Computer Science, Computational Biology, or a related field with 2+ years of experience in OCR, ML, intelligent document processing, and software engineering
  • Strong knowledge of machine learning, OCR, and intelligent document processing, with strong software development skills
  • Excellent skills and deep knowledge of Python, Pythonic design, and object-oriented programming are a must, including common Python libraries such as pandas. Experience with R is a plus
  • Knowledge of ETL pipeline, automation, and workflow management tools such as Airflow, AWS Glue, AWS Step Functions, and CI/CD
  • Understanding of the Databricks offerings including Delta Tables, Lakehouse architecture, and workflows
  • Solid understanding of databases and query engines
  • Solid understanding of AWS cloud computing services such as Lambda functions, EC2, Batch and other compute frameworks such as Spark, EMR, and Databricks
  • Proficiency with container strategies using Docker, Fargate, and ECR
  • Proficiency with Linux and shell scripting

If you are having difficulty applying or if you have any questions, please contact Victoria Kroon at +(1) 929-387-4315.

Proclinical is a specialist employment agency and recruitment business, providing job opportunities within major pharmaceutical, biopharmaceutical, biotechnology and medical device companies.

Proclinical Staffing is an equal opportunity employer.