As a Senior Software Engineer, you will play a key role in developing the next-generation data mining and search platform. You’ll design and implement abstractions, data schemas, and APIs while working closely with our product engineers, product designers, and data scientists on new products. You’ll help scale our data infrastructure to seamlessly integrate an order of magnitude more data sources by building systems to automate and optimize large-scale distributed computing jobs. Your solutions will leverage distributed computing technologies to enable machine learning and NLP algorithms to run on large-scale data. As a data engineer, you will work with world-class data scientists, product designers, and engineers to create products that solve important real-world business problems in a collaborative, fast-paced, and fun startup environment.
Responsibilities:
Building and managing highly reliable distributed data pipelines with high throughput
Working with our data scientists to turn large-scale, messy, and diverse unstructured data into structured, normalized data
Maintaining data integrity across various data sources
Optimizing slow-running database queries and data pipelines
Helping enhance our search engine, which runs sophisticated user queries quickly and efficiently
Building internal tools and back-end services that help our data scientists and product engineers work more efficiently
Requirements:
BS, MS, or PhD in Computer Science or related field, or equivalent work experience
3+ years of experience working with large-scale data in a production environment