Senior Data Engineer, Data Engineering & Bioinformatics
Proclinical Staffing is partnering with a global biotechnology organization seeking a data engineer with expertise in machine learning algorithms to take on projects involving OCR, LLM, and NLP components.
Must be eligible to work in the US.
Job Responsibilities:
- Apply knowledge of machine learning, OCR, and intelligent document processing to develop and implement advanced data processing systems
- Apply expertise in software development to design and build efficient data pipelines and solutions that improve current business processes and reduce time to value
- Collaborate with cross-functional teams to understand business requirements and develop AI/ML applications that support digital transformation initiatives
- Optimize data workflows and implement scalable solutions to accelerate AI adoption and streamline data processing
- Design, implement and manage ETL data pipelines that ingest vast amounts of commercial and scientific data from public, internal and partner sources into various repositories on a cloud platform (AWS)
- Enhance end-to-end workflows with automation that accelerates data flow, using pipeline management tools such as Step Functions
- Manage relationships and project coordination with external parties such as Contract Research Organizations (CROs) and vendor consultants/contractors
Skills and Requirements:
- BS/MS in Computer Science, Bioinformatics, or a related field with 5+ years of software engineering experience, or a PhD in Computer Science, Computational Biology, or a related field with 2+ years of experience in OCR, ML, intelligent document processing, and software engineering
- Strong knowledge of machine learning, OCR, and intelligent document processing, with strong software development skills
- Excellent skills and deep knowledge in Python, Pythonic design, and object-oriented programming are a must, including common Python libraries such as pandas. Experience with R is a plus
- Knowledge of ETL pipelines, automation, and workflow management tools such as Airflow, AWS Glue, AWS Step Functions, and CI/CD
- Understanding of the Databricks offerings including Delta Tables, Lakehouse architecture, and workflows
- Solid understanding of databases and query engines
- Solid understanding of AWS cloud computing services such as Lambda, EC2, and Batch, and of other compute frameworks such as Spark, EMR, and Databricks
- Proficiency with container strategies using Docker, Fargate, and ECR
- Proficiency with Linux and shell scripting
If you are having difficulty applying or have any questions, please contact Victoria Kroon at +(1) 929-387-4315 or firstname.lastname@example.org.
Proclinical is a specialist employment agency and recruitment business, providing job opportunities within major pharmaceutical, biopharmaceutical, biotechnology and medical device companies.
Proclinical Staffing is an equal opportunity employer.