How Hive balances platform stability with innovation
Hive has become one of the big successes of Centrica’s British Gas business, establishing the company as a viable alternative to Google’s Nest.
But being at the forefront of smart home technology means Hive requires a 24 by 7 way of working and an approach to software development that ensures there are no incidents on the back-end software platform on which Hive runs, while giving developers the freedom to create, build and deploy new features quickly and efficiently.
It starts with DevOps, but monitoring has become a key aspect of the DevOps process, and developers are expected to take full responsibility for the code they push into production.
“The challenge with Hive is that we are in quite an innovative space,” says Chris Livingston, head of the cyber reliability engineering team for Hive Home at Centrica. “We know what good looks like and have a very clear idea of things we want to do and things we don’t want to do. But as we innovate there is a grey area in the middle.
“There is no up-front approval process at Hive,” he says. “Instead, the developer teams are provided with a set of guard rails that give our developers a lot of freedom, so long as they are doing everything right. We have a lot of continual compliance.”
As an example, Hive runs a million compliance checks an hour. For Livingston, monitoring is a joint responsibility. “The only people who know if their software is working are the people who wrote the code,” he says. “They have to make sure they send the right data to the monitoring system and they set the right thresholds.
“More and more people are running 24×7 services. The days of turning up to work at 9 o’clock and going home at 5:30 are a rarity. In my job, I work 24×7. If there is an issue with the system out of hours there is an expectation we fix it.”
Livingston’s role is to run all of the infrastructure that keeps Hive running. He says this involves supporting all the teams developing for the Hive platform. “My job is to give the developers an environment where they can focus on their code.”
“We are very much trying to empower our developers to be responsible for the software and the services we develop. We want the developer teams to be 100% focused on delivering value and features to the customer.”
This involves providing an environment for developers to build, test and deploy the code they create. “I worry about monitoring, log aggregation, security and compliance,” says Livingston.
The cyber reliability engineering team provides a set of tools to support developer teams. He says the developer teams are “absolutely responsible” for monitoring the software they produce. “When there is a problem with their software out of hours, they are on call to fix it.”
“We define an incident as the software not doing what it is supposed to,” he says. “Sometimes, we can correct an incident before it becomes a problem, which is why Wavefront is useful.” If the system monitoring is trending in a way that could lead to an incident, the problem can be fixed before any issues arise, according to Livingston.
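The idea of acting on a trend before it becomes an incident can be illustrated with a short sketch. This is not Wavefront's API or Hive's actual alerting logic; the function name, the evenly spaced samples and the straight-line projection are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def minutes_until_breach(
    samples: Sequence[float],
    threshold: float,
    interval_minutes: float = 1.0,
) -> Optional[float]:
    """Estimate how long until a metric crosses `threshold`.

    Hypothetical helper: fits a least-squares line to evenly spaced
    samples and projects it forward. Returns 0.0 if the latest sample
    is already at or above the threshold, and None if there are too
    few samples or the metric is not trending upwards.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least-squares fit of value against sample index.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or falling: no projected breach

    intercept = mean_y - slope * mean_x
    crossing_index = (threshold - intercept) / slope
    steps_remaining = max(crossing_index - (n - 1), 0.0)
    return steps_remaining * interval_minutes


# Example: a metric climbing by 10 per minute towards a limit of 100.
eta = minutes_until_breach([50, 60, 70, 80], threshold=100)
```

A team could raise a warning whenever the estimate drops below some lead time, giving the on-call developer a chance to intervene before the threshold itself is reached.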
The entire end-to-end infrastructure on which the Hive platform is based, including marketing and support websites, data collection services, and the real-time store for user and analytics data, runs on AWS technologies. “We’ve been in the AWS cloud from Day One,” says Livingston. The core technologies used to power Hive are Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (Amazon RDS), and Amazon Simple Storage Service (Amazon S3).
A choice between private and public cloud
According to Livingston, up until now, businesses needed to make a choice between using a private or public cloud. He says: “Our developers don’t have to care about where their code runs.” But having seen a demonstration of VMware orchestration on top of AWS at VMworld in Barcelona, he says: “I can see a huge benefit, because you no longer have to choose [to deploy just] on-premise.
“As I look at all the products bridging physical on-prem and hybrid clouds, it is really powerful not to have to worry where your workloads are. You can have the best of both worlds and leverage all your legacy investments.”
Given that pretty much 100% of Hive runs on AWS, Livingston says: “We take a proactive view of cost management.” For instance, the company uses a system that analyses AWS spending on a daily basis and points out spending anomalies.
He adds that the cyber reliability engineering team’s role is not to become a blocker. “I am trying to provide a set of tooling that enables developers to do their work.”
However, there still needs to be some form of process. “I’m not a fan of process for process’s sake. But I believe good process can empower a business.”
He works with the developer teams to create a process that works both for the teams and for the business. This means developers can deploy their own code. “We don’t work in a more traditional environment where someone else deploys code. Our developers have access to their production environment to deploy their code.”
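A daily check for spending anomalies could work along the following lines. The article does not describe how Hive's system is built, so this is only a sketch: the function name, the seven-day trailing window and the three-standard-deviation rule are all assumptions, and the figures are made up.

```python
from statistics import mean, stdev
from typing import List, Sequence


def spending_anomalies(
    daily_spend: Sequence[float],
    window: int = 7,
    sigmas: float = 3.0,
) -> List[int]:
    """Return the indices of days whose spend deviates sharply from
    the trailing `window` days (hypothetical rule, window must be >= 2).
    """
    anomalies = []
    for day in range(window, len(daily_spend)):
        history = daily_spend[day - window:day]
        baseline = mean(history)
        # Floor the spread at 1% of the baseline so that a perfectly
        # flat history does not flag trivial day-to-day changes.
        spread = max(stdev(history), 0.01 * baseline)
        if abs(daily_spend[day] - baseline) > sigmas * spread:
            anomalies.append(day)
    return anomalies


# Illustrative figures only: a steady bill with one unexpected spike.
spend = [100, 102, 98, 101, 99, 100, 103, 250, 101]
flagged = spending_anomalies(spend)
```

Comparing each day against its own recent history, rather than a fixed budget, lets the check adapt as normal usage grows while still surfacing sudden jumps.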
Hive on Alexa speakers
Hive was selected by Amazon to be one of the Alexa Smart Home Launch Partners for the Amazon Echo in the UK in 2016.
Chris Livingston, head of the cyber reliability engineering team for Hive Home at Centrica, admits his wife is not a big fan of Alexa. “There are lots of gimmicky things on Alexa but then you find some really useful things.”
For Livingston, one of those useful features is being able to boost his Hive heating system. But this raises an interesting question, which harks back to the launch of the Amazon smart speaker. The company was required to develop a default set of spoken heating controls that Alexa could use to control Hive and other smart heating products.
Unfortunately, this default vocabulary lacked one of Hive’s most useful features: the one-touch boost option to switch on hot water or heating for an hour at a preset temperature. “We were very proud to be an Alexa launch partner but we had feedback from customers that we can’t boost heating,” he says.
This resulted in the company receiving plenty of negative feedback about the Hive skill for Alexa, even though the problem was actually with Amazon and its specifications for smart home voice control. To fix the problem, Hive needed to release a second Hive skill for Alexa, so that it could implement a voice command to support “boost”.