Data Engineer

Job Title: Data Engineer

Duration: Full Time Employee

Location: Houston, TX

Summary:
As a Data Engineer, you will develop and maintain scalable data pipelines while collaborating with analytical and business teams to improve data models.

Responsibilities:

Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
Writes unit/integration tests, contributes to the engineering wiki, and documents work.
Performs the data analysis required to troubleshoot data-related issues and assists in their resolution.
Works closely with a team of frontend and backend engineers, product managers, and analysts.
Defines company data assets (data models) and the jobs that populate those models.
Designs data integrations and data quality framework.
Designs and evaluates open-source and vendor tools for data lineage.
Works closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.

Minimum Qualifications:

Degree in an analytical field (e.g. Computer Science, Mathematics, Statistics, Engineering, Operations Research, Management Science) preferred, plus 4+ years of professional experience.
At least 4 years of data analytics experience in a distributed computing environment
Database maintenance
Building and analyzing dashboards and reports
Evaluating and defining metrics and performing exploratory analysis
Monitoring key product metrics and understanding root causes of changes in metrics
Empowering and assisting operations and product teams by building key data sets and making data-based recommendations
Automating analyses and authoring pipelines via a SQL/Python-based ETL framework
Superb SQL programming skills.
Understanding of ETL tools and database architecture.
Advanced knowledge of data warehousing.
Strong knowledge of code and programming concepts. Experience with Python.
Experience with Kubernetes deployments and a DevOps approach
Highly motivated self-starter who is flexible and goal-oriented
Strong Python Knowledge
- Data Models
- Object-Oriented Programming
- Testing (Unit / Regression)
Database Experience
- Window Functions
- Partitioning/Indexes
- Relational and Non-Relational
Big Data Experience
- Hadoop
- Spark
- DataFrame API
Performance Benchmarking
- Cluster Configuration/Optimization
- Spark Optimization
Version Control, CI/CD
- Git
- Jenkins, Drone
Some Cloud Experience
- AWS (primary), Azure, Google Cloud

Nice to Have

- Data Science Experience (either direct or from working closely with a DS team)
  - scikit-learn, TensorFlow, Spark MLlib, General Algebra & Algorithms
- Airflow (Scheduling Tools)
- Container Experience
  - Docker, Kubernetes
- Streaming Experience
  - Kafka, Spark Streaming, Flink

Benefits and Additional Perks:

Fast-paced, collaborative, and fun environment
Work with data and the latest technology to transform the industry
Competitive salary and bonus
Medical, dental, vision, 401k, life and long-term disability insurance
Paid Time Off
