Any business or infrastructure that needs to operate 24/7, with no downtime, requires a critical infrastructure. Many think of control rooms and command centers, specifically in areas such as emergency response, banking, and transportation. However, at this point, almost every company has some kind of critical infrastructure need, large or small.
If you have any sort of data center in your organization, then you have a need for a critical infrastructure.
A single hour of outage – whether a result of a blackout, faulty tech, or, most likely, human error – can cost a company anywhere from thousands of dollars to hundreds of millions of dollars depending on its size and need for a critical infrastructure space.
With that in mind, it’s important to ensure that your space has as little risk of outage as possible.
Here are some critical infrastructure best practices to help make sure that you’re setting your critical infrastructure implementation up for success:
Understand Power Needs
From a critical infrastructure standpoint, you need to be able to power the space with pure power. That means it needs to be conditioned – the same way that a water supply needs to be conditioned.
If you condition your power using the right technology and proper distribution, you won’t run into the problems that many others do.
When it comes to power in critical infrastructure systems, we’re talking about standby generators, uninterruptible power supplies, automatic or static transfer switches, high-end cooling systems, and environmental control.
It’s not just about buying the equipment, either. You’ll also need to test and commission it properly.
The challenge – whether a legacy facility or a brand-new build – is to get a consultant that understands the market and the geography of the area. Is your area prone to earthquakes, tornadoes, or hurricanes? That will play a huge part in the power sources and backup sources you need.
Needs change every year. Companies acquire other companies and inherit legacy critical infrastructures that are run and managed in different ways, and consistency is lost. A proper consultant will look at the full lifecycle of care for these spaces, taking into account the past, present, and future of your organization.
Invest in the Right Technology, Not the Most Expensive
When you design a new critical infrastructure implementation, you need to start with the basis of design document.
Get every stakeholder in the room to discuss what their needs and requirements are. This discussion is a vetting process. As the document is created, people will become aware of the risk and the resiliency that the implementation requires.
That process is going to give the installer a tome of valuable information. Each line of business is going to require different capabilities, but that doesn’t mean that everyone needs Tier 4 solutions.
If you don’t have that initial discussion, you might decide to go with all Tier 4 (the highest level of reliability and redundancy) solutions right off the bat.
That’s a substantial investment, and you won’t learn until later that you might have overspent in areas where specific stakeholders didn’t need the top-of-the-line coverage you assumed they did.
Identify which sections of your business don’t need to be up 24/7, and which sections do.
Define your requirements, identify the cost of being down for an hour, and make your decisions based on the potential loss. If you do a good job with the basis of design, you’ll end up with the proper implementation.
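To make the "cost of being down for an hour" comparison concrete, here is a minimal sketch in Python. Every name and number in it is a hypothetical placeholder, not data from a real facility: each design option gets an assumed annualized cost and an assumed number of expected outage hours per year, and the option with the lowest combined figure wins for a given hourly loss.

```python
from dataclasses import dataclass


@dataclass
class DesignOption:
    """One candidate level of redundancy for a line of business (hypothetical)."""
    name: str
    annual_cost: float              # assumed annualized cost of building and running it
    expected_downtime_hours: float  # assumed outage hours per year


def total_annual_exposure(option: DesignOption, cost_per_hour_down: float) -> float:
    """Annual cost of the option plus the expected loss from its downtime."""
    return option.annual_cost + option.expected_downtime_hours * cost_per_hour_down


def cheapest_option(options: list[DesignOption], cost_per_hour_down: float) -> DesignOption:
    """Pick the option with the lowest total annual exposure."""
    return min(options, key=lambda o: total_annual_exposure(o, cost_per_hour_down))


# Illustrative figures only; replace them with the numbers from your basis of design.
OPTIONS = [
    DesignOption("basic", annual_cost=100_000, expected_downtime_hours=20.0),
    DesignOption("redundant", annual_cost=400_000, expected_downtime_hours=1.5),
    DesignOption("fault tolerant", annual_cost=900_000, expected_downtime_hours=0.5),
]

if __name__ == "__main__":
    for hourly_loss in (5_000, 100_000, 1_000_000):
        choice = cheapest_option(OPTIONS, hourly_loss)
        print(f"Loss of ${hourly_loss:,}/hour -> {choice.name}")
```

With these made-up inputs, a line of business that loses little per hour of outage lands on the basic option, while one that loses far more per hour justifies the most resilient design. That is the same conclusion the stakeholder discussion is meant to surface.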
Understand Management Needs & Train Operators
When you buy a new car, it comes with an owner’s manual. You can spend $100 million to build out a critical infrastructure – it doesn’t come with an owner’s manual.
It comes with a trunk of information that sits on the primary stakeholder’s shelf until they move to another company or position.
From the basis of design to the transition of operations, you need to put controls in place that allow you to understand how to manage the facility once you’re handed the keys. Most times, the proper programs aren’t in place.
The key is training, education, and a continued improvement process not just with people, but with equipment in the facility.
The majority of downtime in critical infrastructures occurs as a result of human error. The technology is the technology, but the human element is unpredictable. You need to use education and training as the tool to fight against human error the way you might use maintenance cycles to fight against technological error.
The challenge is a lack of talent. Students often don’t learn to operate critical infrastructure environments – they’ll learn electrical, mechanical, civil, and structural engineering, but not operation.
When a new employee comes in, start with a skills assessment. Identify those skills, strengths and weaknesses. Knowing the weaknesses is most important – once identified, you can put a program together that educates them and turns weaknesses into strengths.
That means proper daily inspections, proper response and action plans, standard operating procedures, and so on. You shouldn’t only have these in place; you should also have software programs that help educate employees on them.
That way, not only does the current staff understand how to operate the space, but new employees will, too. An orientation program keeps these spaces consistent and builds culture in the organization.
Keep in mind that the people running these spaces now will be retiring sooner rather than later. You should set up programs for those individuals to transfer their knowledge to the newer employees who will one day take over.
It’s people that run these facilities. If the critical infrastructure goes down, it hurts the company. If you don’t properly educate the employees maintaining and operating the facility, you’re setting yourself up for failure.
The Efficiency Factor
Both people and technology work to make a critical infrastructure efficient.
From the technology side, the equipment you buy must be efficient, and you have to ensure it’s loaded properly. People are the ones who maintain it, though. If you become inefficient operationally and don’t set the proper thresholds, the technology will be affected.
Cooling systems are a great example. Cooling is an important and costly aspect of any data center. If you don’t have thresholds and set points calibrated appropriately, that data center might spend another 10% on cooling.
Over the course of a year, that’s a large amount of money for electricity, a cost that could have been avoided by raising the set points to the industry standard. Humans and technology work together to make that efficient: the human calibrates and the technology works based off of that set point.
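To put a rough number on that extra 10%, here is a small Python sketch. The average cooling load and the electricity price are hypothetical inputs chosen for illustration; only the 10% figure comes from the example above.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def annual_cooling_cost(avg_cooling_kw: float, price_per_kwh: float) -> float:
    """Yearly electricity cost of running the cooling plant at an average load."""
    return avg_cooling_kw * HOURS_PER_YEAR * price_per_kwh


def avoidable_overspend(baseline_cost: float, overspend_fraction: float = 0.10) -> float:
    """Extra yearly spend caused by poorly calibrated set points."""
    return baseline_cost * overspend_fraction


if __name__ == "__main__":
    # Assumed values: 500 kW average cooling load at $0.10 per kWh.
    baseline = annual_cooling_cost(avg_cooling_kw=500, price_per_kwh=0.10)
    extra = avoidable_overspend(baseline)
    print(f"Baseline cooling cost: ${baseline:,.0f} per year")
    print(f"Avoidable overspend:   ${extra:,.0f} per year")
```

Under those assumed inputs the baseline works out to roughly $438,000 a year, so a 10% overspend is on the order of $43,800 a year for a single set point decision.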
You need both in tandem to remain efficient and eliminate unnecessary costs.
Let’s say you want to use a fuel cell for your data center, rather than going grid to chip for power. A fuel cell can provide efficiencies above 65%, while the grid efficiency is only 33%, which means roughly two-thirds of the source energy is lost before grid power reaches your equipment. You have to outlay a bit more money up front, but you’ll see a return on your investment over the life of the building.
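The efficiency gap and the payback idea can be sketched in a few lines of Python. The 65% and 33% figures are the ones quoted above; the upfront cost and annual savings in the payback example are hypothetical, and the simple payback formula ignores financing, maintenance, and fuel price changes.

```python
def source_energy_per_delivered_kwh(efficiency: float) -> float:
    """kWh of source energy needed to deliver 1 kWh to the load."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return 1.0 / efficiency


def simple_payback_years(extra_upfront_cost: float, annual_savings: float) -> float:
    """Years for the annual savings to cover the extra upfront spend."""
    if annual_savings <= 0:
        return float("inf")  # the investment never pays back
    return extra_upfront_cost / annual_savings


if __name__ == "__main__":
    fuel_cell = source_energy_per_delivered_kwh(0.65)
    grid = source_energy_per_delivered_kwh(0.33)
    print(f"Fuel cell: {fuel_cell:.2f} kWh of source energy per delivered kWh")
    print(f"Grid:      {grid:.2f} kWh of source energy per delivered kWh")

    # Assumed figures for illustration only.
    years = simple_payback_years(extra_upfront_cost=1_000_000, annual_savings=250_000)
    print(f"Simple payback: {years:.1f} years")
```

At those efficiencies the grid path needs about twice the source energy of the fuel cell for the same delivered kilowatt-hour, which is where the long-term return comes from.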
What makes a critical infrastructure environment robust is blending alternative energy with conventional energy. When the grid is out, then the alternative energy can provide some capacity to run processing.
Power, people, and processes – put best practices into those three areas and you’ll have an efficient data center.