
Boston, Massachusetts Full Time Posted: Tuesday, 3 December 2019
Applicants must be eligible to work in the specified location
Vertex is seeking a unique individual with deep experience in, and passion for, cutting-edge cloud-based serverless data technology, who is looking to help define and implement a step change in science and in our ability to address human disease. Vertex is seeking a unique talent who can work at the intersection of science, data and technology to enable life-changing impact for people around the world. We seek a data technology leader who can work with our business and scientific strategists to drive a cloud data architecture that matches our high ambitions. Vertex is a fast-moving organization that depends on multiple technologies to propel our mission forward. Vertex is in a transformational period in which we are accelerating our capabilities, technology and data to augment our scientific mission, enable Vertex to grow in scale, and remain at the forefront of science and medicine.

This position provides leadership and direction to our newly formed cloud & data function, which will help revolutionize the way Vertex leverages data and the cloud to build new learning models in both our scientific and enterprise endeavors. Vertex is looking to embrace serverless as a core principle and to enable microservice development as well as machine learning (ML/DL/AI).

We are looking for a data architect with deep experience and the ability to be hands-on, to transform how Vertex leverages massive amounts of internal and external data and to ensure unstructured data can be leveraged securely across multiple platforms. You will own the design of Vertex's data lakes and pipelines, including architecture for ingestion, modelling, schema, metadata, quality and validation, ensuring data is optimized for analysis and ready for new learning models to be applied. We've embraced the serverless ethos where possible and look to architect for flexibility and scale. We need to enable scientists, including computational chemists and geneticists, to explore, develop and leverage new computational models to tackle difficult biological problems and help people. We also need to get this data into the hands of the rest of the business.

This is a hands-on role for someone who wants to solve important scientific problems that depend on "big data" and enable new paths of innovation.

Key Responsibilities:

* Provide leadership and deep technical expertise by developing a comprehensive data architecture that matches Vertex's strategy
* Work collaboratively with Solutions, API and Security architects to design and build the first iteration of an internal data platform
* Employ an iterative approach to enable a rapid release capability
* Design data models that handle the complexities of internal data (scientific and enterprise) as well as the ingestion of large data sets from external sources
* Enable scale, as the data sets are large
* Data Domain Modeling and Logical Modeling; Data Profiling and Quality Assessment
* Create data flow diagrams, ensuring optimal design and integration with application design and flow
* Design optimal schemas, partitions and indexing for relational and NoSQL storage variations (columnar, key-value, object/document), chosen based on the situation
* Design event streams and schemas
* Take overall ownership of the data architecture and detailed design of the cloud event hub, message queues, micro-services, and application processing, in addition to S3 bucket structure, data schemas, and user application schemas (e.g., EMR workloads)
* Study existing information processing systems to evaluate their effectiveness, and develop new systems to improve production or workflows as required
* Help develop master data governance framework, including data governance strategy, approach, and roadmap.
* Establish data dictionary and authoritative sources for the core data elements
* Partner with other enterprise data functions to drive the long-term development of data infrastructure, including data warehousing, reporting, and analytics platforms.
* Review architectural designs and IT solutions to ensure consistency, maintainability, and flexibility.
* Architect and implement, with other teams, full deployment, data capacity planning and security for all production client deployments.
* Collaborate on and support internal development needs for our platform's product development, and take part in that development.
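For illustration only (this sketch is not part of the posting, and the bucket name, key prefixes and routing table below are hypothetical): the event-driven pattern described in the responsibilities above, where objects landing in an S3 lake bucket trigger downstream processing via an event hub or queue, is often realized with a small serverless handler along these lines:

```python
import json
import urllib.parse

# Hypothetical routing table: S3 key prefix -> downstream pipeline stage.
# In a real deployment these would map to EMR steps, queues, or micro-services.
ROUTES = {
    "raw/genomics/": "genomics-etl",
    "raw/chem/": "chem-etl",
}

def route_s3_event(event):
    """Parse an S3 PUT notification and decide which pipeline handles each object."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications; '+' stands for a space.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        stage = next((v for p, v in ROUTES.items() if key.startswith(p)),
                     "quarantine")  # unknown prefixes go to a quarantine stage
        results.append({"bucket": bucket, "key": key, "stage": stage})
    return results

# Minimal example event, shaped like an S3 notification a Lambda would receive.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "vertex-lake"},
                "object": {"key": "raw/genomics/batch+001.vcf"}}}
    ]
}
print(json.dumps(route_s3_event(sample_event)))
```

The same handler shape works whether the notification arrives directly from S3, via SQS, or via an event hub; only the unwrapping of `event` changes.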

Qualifications:

* Expertise in AWS services such as Amazon Elastic Compute Cloud (EC2), AWS Data Pipeline, S3, DynamoDB (NoSQL), Relational Database Service (RDS), Elastic MapReduce (EMR) and Amazon Redshift.
* Push the envelope; Imagine bold possibilities and work with our scientists and partners to find innovative new ways to satisfy business needs through Data/Business Intelligence cloud computing.
* 10+ years' experience of IT platform implementation in a highly technical and analytical role
* Experience working with many different database and analytical technologies including MPP databases, noSQL storage, Data Warehouse design and implementation
* BI reporting and Dashboard development exposure
* Experience collaborating with computer scientists on algorithms, data structures, and software engineering
* Able to take on and complete assignments independently

* Experience implementing enterprise data lakes
* Strong verbal and written communications skills and ability to lead effectively across organizations
* Hands on experience leading large-scale global data warehousing and analytics projects
* Demonstrated industry leadership in the fields of database, data warehousing or data sciences
* In-depth technical experience with data technologies such as Hadoop (HDFS, Hive, Map/Reduce, EMR), Spark, Snowflake, Presto, Kafka/Samza, BigQuery, etc.
* Ability to transform raw, noisy log level data into useful business fact tables
* ETL expertise
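As an aside (not part of the posting): "transforming raw, noisy log-level data into useful business fact tables" typically means aggregating event records up to a business grain. A minimal sketch, with hypothetical field names and grain:

```python
from collections import defaultdict

# Hypothetical raw event log: one record per event, at log-level granularity.
raw_events = [
    {"user": "u1", "action": "query", "ms": 120},
    {"user": "u1", "action": "query", "ms": 80},
    {"user": "u2", "action": "export", "ms": 300},
]

def build_fact_table(events):
    """Aggregate log-level events to a (user, action) grain fact table."""
    grain = defaultdict(lambda: {"count": 0, "total_ms": 0})
    for e in events:
        key = (e["user"], e["action"])          # the declared grain
        grain[key]["count"] += 1                # additive measure: event count
        grain[key]["total_ms"] += e["ms"]       # additive measure: total latency
    return {k: dict(v) for k, v in grain.items()}

fact = build_fact_table(raw_events)
print(fact[("u1", "query")])
```

At warehouse scale the same aggregation runs as SQL or Spark rather than in-process Python, but the design question is identical: pick the grain, then keep only additive measures against it.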

Master's or PhD in Computer Science, Physics, Engineering or Math.
