Analytics Engineer
We are both an electricity retailer and a tech platform, and we believe there is no better way to address our greatest challenge, climate change, than with the combination of the two.
Through our proprietary tech platform, Kraken, we are changing the way people interact with their energy company - by making it approachable, low cost, easy to understand, and most importantly, 100% renewable. We’ve distinguished ourselves by being named 2020’s Energy Provider of the Year, which highlights our commitment to exceptional customer service. In many markets we are a leading employer on Glassdoor for best places to work.
At Octopus we’re focused on making energy fair, clean, and simple for all using technology. We’re looking for an Analytics Engineer that can help us with this challenge. Our data team is developing a data platform and providing data services to inform Octopus US business strategy and operations. This data platform enables self-service of data analytics for business stakeholders as well as automation of all our data workflows from ETL jobs to ML training and prediction. The data platform team works across the whole customer domain on anything from energy load forecasting to financial and customer data modeling.
Octopus Energy is growing fast, and that means lots of data that needs to be ingested, organized, analyzed, and shared with the team. You’ll work across all different parts of the business to understand what our teams need and deliver data pipelines and tools to meet those needs. Because it’s still early days, you’ll need to be versatile and equally comfortable building robust, production-ready pipelines or hacking together a quick script to run on your machine. You’ll spend most of your time engineering, but you should also enjoy analyzing data and building data interfaces like dashboards or data applications.
You’ll be part of our global data platform team who will provide dev ops and infrastructure support as well as technical guidance. We’re building a consistent data platform across all Octopus retail businesses around the world so you’ll be part of and contribute to a global data community. This position is based in Houston, Texas.
What you'll do
- Work with the data scientists, data analysts, and business stakeholders to scope out and plan new data sources and pipelines
- Build, automate, deploy and maintain data models and workflows
- Develop Streamlit data apps and lend a hand building and maintaining Tableau dashboards
- Spearhead efforts on data monitoring and integrity to ensure accuracy of reporting
- Work with the global data platform team to deploy new tools and services into the US data environment
- Participate in and contribute to our global data community
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
- Use an analytical, data-driven approach to build a deep understanding of a fast-changing business
- Build large-scale batch and real-time data pipelines with data processing frameworks
What you'll need
- 2+ years of experience in data engineering
- First and foremost, we want our data engineers to be great software engineers with a passion for writing high quality code
- It would be helpful to have experience/expertise in the following (in rough priority order):
- Python (for data pipelines and analytics)
- Advanced SQL knowledge and query-authoring experience with relational databases, plus working familiarity with a variety of database engines
- Experience modeling data for analytics - ideally with dbt as a modeling tool
- Experience building data pipelines in a cloud environment (ideally AWS)
- Spark
- The projects will be varied, so we’re looking for someone who can work autonomously and proactively to scope problems and deliver pragmatic solutions
- We want someone who is passionate about building great data tools for our business teams
- Experience in the energy industry, or enthusiasm for innovating toward a more intelligent and clean grid, is a big plus!
- The ability to work alongside the team in our downtown Houston office
Our Data Platform Stack
- Python as our main programming/scripting language
- Kubernetes for data services and task orchestration
- Airflow purely for job scheduling and tracking
- Circle CI for continuous deployment
- Parquet and Databricks Delta file formats on S3 for data lake storage
- Spark and pandas for data processing
- dbt for data modeling
- Presto and SparkSQL for querying
- Jupyter for data notebooks and ad-hoc analytics
- Streamlit for data applications
- Tableau for BI