Now that we have determined the best geographic location for our data center, it is time to evaluate local facility options. The business concept of industry clustering is valid in the data center industry. In most locations supporting carrier hotels and Internet Exchange Points you will normally see a large number of data centers in very close proximity, offering a variety of options and a maze of confusing pitches from aggressive salespeople.

The idea of industry clustering says that whenever a company in a certain industry, such as an automobile manufacturer, selects a location to build a factory or assembly plant, others in the industry will eventually locate nearby. This is due to a number of factors including the availability of skilled workers within that industry, favorable city support for zoning, access to utilities, and proximity to supporting infrastructure such as ocean ports, rail, population centers, and communications.

The data center industry has evolved in a similar model. When you look at locations supporting large carrier hotels, such as Los Angeles, Seattle, San Francisco, London, and New York, you will also see there are many options for data centers in the local area. For example, in Los Angeles the One Wilshire Building is a large carrier hotel with collocation space within the building; however, there are many options within very close proximity to One Wilshire, such as Carrier Center (600 W. 7th), 818 W. 7th St., the Garland Building, 530 W. 6th, the Quinby Building, and several others.

The Bay Area has similar clusters stretching between Palo Alto and San Jose, and Northern Virginia (Ashburn, Reston, Herndon, Sterling, Vienna) has a high density of facilities in proximity to the large Equinix Exchange Point in Ashburn.

When you have data center clusters, you will also find each facility is either fully meshed with commercial dark fiber interconnecting the buildings, or has several options of network providers offering competitive “lit” services between buildings. 

Note the attached picture of downtown Los Angeles, showing all the major colocation facilities and physical interconnection between the facilities with high capacity fiber (Wilshire Connection).

Discriminating Features Among Data Centers

The Uptime Institute, founded in 1993 (and recently acquired by the 451 Group) has long been a thought leader in codifying and classifying data center infrastructure and quality standards. While many may argue the Uptime Institute is focused on enterprise data center modeling, the same standards set by the Uptime Institute are a convenient metric to use when negotiating data center space in a commercial or public data center.

As mentioned in Part one of this series, there are four major components to the data center:

  • Concrete (space for cabinets, cages, and suites)
  • Power
  • Air-conditioning
  • Access to telecom and connectivity

Each data center in the cluster will offer all the above, at some level of quality scale that differs from others in the cluster. This article will focus on facility considerations. We will look at the Uptime Institute’s “tiered” system of data center classification in a later post.

Concrete. Data centers and carrier hotels supporting major interconnection points or industry cluster “hubs” will generally draw higher prices for their space. The carrier hotel will draw the highest prices, as being colocated with the telecom hub adds value to space within the meet-me-room and to adjacent space within the same building. Space within the carrier hotel facility is also normally limited (there are exceptions, such as the NAP of the Americas in Miami), restricting individual tenants to a few cabinets or small cages.

The attraction of being in or near the carrier hotel meet-me-room is not the high cost cabinet or cage itself; it is the availability of multiple carriers and networks, normally reachable with a simple cross connect or jumper cable, rather than forcing networks and content providers to purchase or lease expensive backhaul to interconnect with carriers or networks collocated in a different facility.

Meet-me-rooms at the NAP of the Americas, 60 Hudson, the Westin Building, and One Wilshire in the US, and Telehouse in London offer meet-me-room interconnections with several hundred potential interconnection partners or carriers within the same main distribution frame. Thus the expensive meet-me-room cabinets and cages make up their value through access to other carriers with inexpensive cross connects.

NOTE: One thing to keep in mind about carrier hotels and meet-me-rooms: most of the buildings supporting these facilities were not designed as data centers; they are office conversions. Thus the electrical systems, air-conditioning systems, floor loading, and security infrastructure are not as robust as you might find in a nearby facility constructed as a data center or telecom central office.

Facilities near the carrier hotel will generally have slightly lower cost space. As industry concerns over security within the carrier hotel increase, and the presence and quality of adjacent buildings exceeds that of the carrier hotel, many companies are reconsidering their need to locate within the legacy carrier hotel. In addition, many nearby collocation centers and data centers are building alternative meet-me-rooms and distribution frames within their building to accommodate both their own tenants, as well as offering the local community a backup or alternative interconnection point to the legacy carrier hotel.

This includes the development of alternative and competitive Internet Exchange Points.

This new age of competitive or alternate meet-me-rooms, multiple Internet Exchange Points, and data center industry clusters gives the industry more flexibility in their facility selection. In the past, Hunter Newby of Allied Fiber claimed “if you are not present in a facility such as 60 Hudson or the Westin Building, you are paying somebody else to be in the building.” This has gradually changed, as in cities such as New York a company can get near identical interconnection or peering support at 111 W. 8th St or 32 Ave of the Americas as available within 60 Hudson.

As the clusters continue to develop, and interconnections between tenants within the buildings become easier, the requirement to physically locate within the carrier hotel becomes less acute. If you are in Carrier Center in Los Angeles, the cost and difficulty to complete a cross-connection with a tenant within One Wilshire has become almost the same as if you were a tenant within the One Wilshire Building. Ditto for other facilities within the industry cluster. In fact, the entire metro areas of New York, the Bay Area in Northern California, Northern Virginia, and Los Angeles have all become virtual extensions of the original meet-me-room in the legacy carrier hotel.

The Discriminating Factor

Now as potential data center tenants, we have a somewhat level playing field of data center operators to choose from. This has eliminated much of the interconnection part of our equation, and allows us to drill into each facility based on our requirements for:

  1. Cost/budget
  2. Available services
  3. Space for expansion or future growth
  4. Quality of power and air conditioning

Part four of this series will focus on cost.

As always, your experiences and comments are welcome.

John Savageau, Long Beach

Wilshire Connection photo courtesy of Eric Bender at www.wilshireconnection.com

Data center selection is an exercise in compromise. Everybody would like to have the best of all worlds, with a highly connected facility offering 24×7 smart hands support, impenetrable security, protection from all natural and man-made disasters, in addition to service level agreements offering 5-Nines power availability at $.03/kW. It is not likely we will be able to get all those desired features in any single facility.
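To put the “5-Nines” figure in perspective, here is a minimal sketch that converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; they do not come from any particular service level agreement.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year (assumed)


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Compare three common availability targets.
for label, availability in [("3-Nines", 0.999), ("4-Nines", 0.9999), ("5-Nines", 0.99999)]:
    print(f"{label}: {allowed_downtime_minutes(availability):.2f} minutes/year")
```

At 5-Nines the budget works out to a little over five minutes of power interruption per year, which is why it is rarely offered at bargain pricing.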

Data center operators price their facilities and colocation based on several factors:

  • Cost of real estate in their market
  • Cost of power and utilities in their market
  • Competition in their market
  • Level of service offered (including power, interconnections, etc)
  • Quality of facility (security, power density, infrastructure, etc)

Networks, Content Providers, Enterprises, and Eyeballs

The basic idea of an Internet-enabled world is that eyeballs (human beings) need to access content, content needs access to eyeballs, eyeballs and content need access to networks (yes, eyeballs do need to communicate directly with other eyeballs), and networks need access to content and eyeballs. Take one of the above out of the equation, and the Internet is less effective. We can also logically add applications to the above model, as applications are now communicating directly with applications, allowing us to swap eyeballs for apps to complete the high level model.

Organizations using the Internet fall into a category of either a person, an application (including enterprise, content, and entertainment applications), or a network (including access, regional, and global networks).

Each potential organization considering outsourcing some or all of their operations into a data center needs to ask themselves a few basic questions:

  1. Is the organization heavily dependent on massive storage requirements?
  2. Is the organization highly transaction-oriented? (such as a high volume eCommerce site)
  3. Is the organization a content delivery network/CDN, requiring high bandwidth access to eyeballs?
  4. Are your target applications or eyeballs local, regional, global?
  5. Is the company a network service provider highly dependent on network interconnections?

Storage and servers = high density power requirements. The more servers, the higher the operational expenses on both space and power. This would logically drive a potential collocation customer to a location with the cheapest power – however that might be a location outside of central business districts, and possibly outside of an area well connected with domestic and international telecom carriers, network service providers, and access networks (including the cable TV networks serving individual subscribers).

Thus the cost of power and real estate might be favorable if you are located in Iowa, however bringing your content to the rest of the world may limit you to one or two network providers, which with limited competition will likely raise the price of bandwidth.

Locating your business in a city center such as New York or Los Angeles will give you great access to bandwidth through either a colocated carrier hotel or carrier hotel proximity. However, the cost of real estate and power in the city center will be a multiple of what you may find in areas like Oregon or Washington State.

In a perfect telecom world, all networks and customers would have access to dark fiber from facility-based carriers serving the locations where they are located or doing business. Allied Fiber’s Hunter Newby believes that facility-based carriers should be in the business of providing the basic “interstate highway” of communications capacity, allowing any company who can afford the cost to acquire high capacity interconnections to bring their operation closer to the interconnection points.

If you follow the carrier world you will know that, at least in the United States, carriers are reluctant to sell dark fiber resources, preferring to multiplex their fiber into “lit” circuits managed and provisioned by the carrier. Clearly that provides a lot more potential revenue than selling “wholesale” infrastructure. It also makes it a lot more expensive for a company considering collocation to locate their facility in a geography separated from the major interconnection sites.

The Business Case and Evaluation

Again, selecting your desired location or locations to outsource your business is a compromise. In the United States, Virginia is a good location for power, and an expensive location for interconnecting and collocating. Los Angeles is among the lowest cost areas for interconnections, midway up the power scale, but more expensive for space.

Consider the possibility of moving to a great location in Idaho, with low cost power, and low cost real estate. You build a 500,000sqft facility, with more than 300 watts/sqft power capability. Your first project supports more than 20,000 servers delivering Internet streaming media content. Your facility costs are low, but your network costs become very high. You cannot buy dark fiber from a facility-based carrier, and the cost of leasing 10G wavelengths is nearly $10,000/month per wavelength. You probably have 500GB of data to push into the Internet. Is the power cost vs. connectivity and bandwidth compromise in your favor?
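The arithmetic behind that question can be sketched roughly as follows. The facility size, power density, wavelength size, and wavelength price are the round figures from the scenario above. The traffic figure is an assumption: the scenario does not give a rate, so this sketch reads it as 500 Gbps of sustained outbound traffic purely for illustration. All function and constant names are hypothetical.

```python
import math

FACILITY_SQFT = 500_000             # from the scenario
WATTS_PER_SQFT = 300                # "more than 300 watts/sqft", rounded down
WAVELENGTH_GBPS = 10                # one leased 10G wavelength
WAVELENGTH_COST_PER_MONTH = 10_000  # "nearly $10,000/month", rounded up
ASSUMED_PEAK_GBPS = 500             # ASSUMPTION: traffic read as 500 Gbps sustained


def facility_power_mw(sqft: int, watts_per_sqft: int) -> float:
    """Total design power capability of the facility in megawatts."""
    return sqft * watts_per_sqft / 1_000_000


def wavelengths_needed(peak_gbps: float, wavelength_gbps: int = WAVELENGTH_GBPS) -> int:
    """Number of leased wavelengths needed to carry the peak traffic."""
    return math.ceil(peak_gbps / wavelength_gbps)


def monthly_transport_cost(peak_gbps: float) -> int:
    """Monthly cost of leased wavelengths, ignoring redundancy and headroom."""
    return wavelengths_needed(peak_gbps) * WAVELENGTH_COST_PER_MONTH


print("Design power (MW):", facility_power_mw(FACILITY_SQFT, WATTS_PER_SQFT))
print("Wavelengths needed:", wavelengths_needed(ASSUMED_PEAK_GBPS))
print("Transport cost ($/month):", monthly_transport_cost(ASSUMED_PEAK_GBPS))
```

Under those assumptions the leased transport alone runs to several hundred thousand dollars a month before any redundancy is added, which is the number to weigh against the savings on power and real estate.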

Here is another exercise. Let’s say for argument, in a Los Angeles carrier hotel static costs may run:

  1. $1000/month for a cabinet in the carrier hotel, $500/month for a cabinet in nearby facility.
  2. $12/breakered amp (breakered amps are still the norm, moving to usage-based models)
  3. $200/month for a cross connection within the carrier hotel building
  4. $1000/month for a fiber cross connect to a nearby or adjacent building
  5. $1000/month for an Internet Exchange Point/IXP connection (if you are a network service provider)

NOTE: Los Angeles has several large carrier hotels in the downtown area, as does New York, with buildings such as 60 Hudson and 111 W. 8th offering potential tenants multiple options. Other cities such as Seattle, Miami, and Chicago have more limited options, with a single dominant carrier hotel.

If you are a medium sized network service provider, you may consider getting a couple of cabinets in a nearby facility and acquiring a couple of fiber cross connections to one or more nearby carrier hotels. Get a cabinet within the carrier hotel, add high capacity switching or routing equipment in the cabinet, and then try to maximize the number of local cross connects with other networks and content providers, and connect to a local Internet Exchange Point for additional peering flexibility.
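As a rough illustration, the scenario above can be priced with the Los Angeles rates listed earlier. The rates come from that list; the quantities (60 breakered amps, ten in-building cross connects) are hypothetical, and the power rate is assumed to be a monthly charge since the list does not state a period.

```python
# Monthly rates quoted above for a Los Angeles carrier hotel (USD).
RATES = {
    "carrier_hotel_cabinet": 1000,
    "nearby_cabinet": 500,
    "breakered_amp": 12,               # ASSUMPTION: charged per month
    "in_building_cross_connect": 200,
    "fiber_cross_connect": 1000,
    "ixp_connection": 1000,
}

# Hypothetical quantities for a medium sized network service provider.
QUANTITIES = {
    "carrier_hotel_cabinet": 1,
    "nearby_cabinet": 2,
    "breakered_amp": 60,               # ASSUMPTION: 20 amps per cabinet, 3 cabinets
    "in_building_cross_connect": 10,   # ASSUMPTION: ten local interconnections
    "fiber_cross_connect": 2,
    "ixp_connection": 1,
}


def monthly_cost_by_item(rates: dict, quantities: dict) -> dict:
    """Multiply each quantity by its unit rate."""
    return {item: rates[item] * qty for item, qty in quantities.items()}


def total_monthly_cost(rates: dict, quantities: dict) -> int:
    """Sum the per-item monthly costs."""
    return sum(monthly_cost_by_item(rates, quantities).values())


for item, cost in monthly_cost_by_item(RATES, QUANTITIES).items():
    print(f"{item:28s} ${cost:>6,}")
print(f"{'total':28s} ${total_monthly_cost(RATES, QUANTITIES):>6,}")
```

Swapping in the rate card for another market while keeping the quantities fixed gives a like-for-like comparison between cities.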

Then take your same requirement for both cabinet space and interconnections, and try the evaluation in several different cities and markets. Fit the cost into one of the above squares in the Data Center Basic Elements chart, and determine the cost for each component.

If your business requirement is more dependent on space, and that is the highest potential operational expense, then you need to consider which location will minimize cost increases in the other three quadrants while you evaluate the best location for meeting your space budget. If your requirement spans several different geographies, add the cost of interconnection between locations to your interconnection costs. Does the location give you adequate access to the target applications or eyeballs?

If you find that a location in Omaha, Nebraska, meets all your requirements, but your target audience also includes a high percentage in India or China, then the cost of getting to your eyeballs in both OPEX and performance may make the Nebraska site untenable – even though it meets your high level budget.

Enter the Cloud

Nearly all businesses and organizations now have an additional alternative: the virtualized commercial cloud service provider (CSP). Virtualization products have come a long way over the past couple of years, and are maturing very quickly. CSPs such as Google, Amazon, Rackspace, and Layered Technologies are providing very powerful applications support for small and medium business, and have become the subject of a very visible debate at the national level as governments and large corporations deal with questions of:

  • Focusing on their core competencies, rather than internal IT organizations
  • Building more efficiency into the IT infrastructure (heavy on energy efficiency)
  • Recovering space used by IT and computer rooms
  • Reducing OPEX spent on large IT support staff
  • Better technologies such as netbooks
  • And more…

Thus the physical data center now has competition from an unlikely source – the cloud. All new IT and content-related projects should consider cloud computing or software as a service (SaaS) models as a potential alternative to bricks and mortar data center space.

Many venture capital companies are now requiring their potential investments to consider a hosted or SaaS solution to outsource their office automation, web presence, and eCommerce applications. This is easily done through a commercial web service or cloud hosting company, with the additional option of on-demand or elastic expansion of their hosting resources. This may be the biggest potential competitor to the traditional data center. The venture community simply does not want to get stuck with stranded equipment or collocation contracts if their investment fails.

Disaster Recovery and Business Continuity

One final note on selecting your location for outsourcing. Most companies need some level of geographic diversity to fulfill a business need for offsite disaster recovery apps and storage, load balancing, proximity (to eyeballs and applications), and interconnections. Thus your planning should include some level of geographic diversity, including the cost of interconnecting facilities to mirror, parse, or back up files. The same rules apply, except that in the case of backup the urgency for high density interconnections is lower than the primary operating location.

This does raise the potential of using facilities in remote locations, or locations offering low cost collocation and power pricing for backups.

Links to Data Center Resources

Here are a couple links to magazines and eZines supporting the data center industry.

Part 3 will explore the topic of understanding the hidden world of data center tiers, mechanical and electrical infrastructure, and site structure.

John Savageau, Long Beach

Prior articles in this series:

28Oct09

Selecting Your Data Center Part 1 – Understanding the Market

Telecom Risk and Security Part 4 – Facilities

On October 14, 2009, in Internet and Telecom, by Administrator

A 40 year old building with much of the original mechanical and electrical infrastructure. A 40 year old 4000 amp, 480 volt aluminum electrical buss duct, which had been modified and “tapped” often during its life, with much of the work done in violation of equipment specifications. With old materials such as buss insulation gradually deteriorating, the duct expanding and contracting over the years, and the fact that aluminum was used during the initial installation to either save money or test a new technology vision, it all becomes a risk. A risk of buss failure, or at worst a buss failing to the point it results in a massive electrical explosion.

Sound extreme? Now add a couple of additional factors. The building is a mixed-use telecom carrier hotel, with additional space used for commercial collocation and standard commercial office space. This describes most of the carrier hotel facilities in the US and Europe: old buildings, converted to mixed-use carrier hotel and collocation facilities, due mainly to an abundance of vacant space during the mid-1990s, and a need for telecom interconnection space following the Telecommunications Act of 1996.

Over the past four years the telecom, Internet, and data center industry has suffered several major electrical events. Some have resulted in complete facility outages, others have been saved by backup systems which operated as designed, preventing significant disruption to tenants and the services operated within the building.

A partial list of recent carrier hotel and data center facility outages or significant events includes some of the most important facilities in the telecom and Internet-connected industry:

  • 365 Main in San Francisco
  • RackSpace hosting facilities in Dallas
  • Equinix facilities in Australia and France
  • MPT in San Jose
  • IBM facility in NZ
  • Fisher Plaza in Seattle
  • Cincinnati Bell

And the list goes on. Facilities which are managed by good companies, but have many issues in common. Most of those issues are human issues. The resulting outages caused havoc or chaos throughout a wide range of commercial companies, telecom companies, Internet services and content.

The Human Factor in Facility Failures

Building a modern data center or carrier interconnection point follows a fairly simple series of tasks. Following a data center design and construction checklist, with strict compliance to the process and individual steps, can often mean the difference between a well-run facility and one that is at risk of failure during a commercial power outage, or systems failure.

In the design/construction phase, data center operators follow a system of:

  • Determining the scope of the project
  • Developing a data center design specification based on both company and industry standards
  • Designing a specific facility based on business scope and budget, which will comply with the standard design specification
  • Publishing the design specification and distributing it to several candidate construction management companies and engineering companies
  • Using a strong project manager to drive the construction, permitting, certification, and vendor management process
  • Completing systems integration and commissioning prior to actual operations

Of all the above tasks, a complete commissioning plan and integration test is essential to building confidence the data center or telecom facility will operate as planned. Many outages in the past have resulted from systems that were not fully tested or integrated prior to operations.

An example may be a breaker coordination study. This is the process of ensuring switch gear and panel breakers, from the point of electrical presentation by the local power utility down to individual breaker panels, are set, tested, and integrated according to vendor specification. Without a complete coordination study, there is no assurance components within an electrical system will operate correctly either during normal conditions or during equipment failures. The study is an essential component of a complete systems integration test. Failure to complete a simple breaker coordination study during commissioning has resulted in major electrical failures in data centers as recently as 2008.

The InterNational Electrical Testing Association (NETA) provides guidance on electrical commissioning for data centers under “full design load” conditions. This includes testing recommendations to test performance and operations including the sequence of operations for electrical, mechanical, building management systems/BMS, and power monitoring/management. The actual levels of NETA testing are:

  • Level 1- Submittal Review and Factory Testing
  • Level 2- Site Inspection and Verification to Submittal
  • Level 3- Installation Inspections and Verifications to Design Drawings
  • Level 4- Component Testing to Design Loads
  • Level 5- System Integration Tests at Full Design Loads

No company should consider collocation within a facility that cannot produce complete documentation that integration testing and commissioning was completed prior to facility operations – and that testing should be at NETA Level 5. In some cases, documentation of “retro” testing is acceptable, however potential tenants in a facility should be aware that is still a compromise, as it is almost impossible to complete a retro-commissioning test in a live facility.
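For tenants who track candidate facilities in a spreadsheet or script, the rule above can be captured as a simple check. This is only a sketch: the level names mirror the list in this article, while the function name and the three outcome labels are hypothetical.

```python
from enum import IntEnum


class NetaLevel(IntEnum):
    """Commissioning levels as listed in this article."""
    SUBMITTAL_REVIEW_AND_FACTORY_TESTING = 1
    SITE_INSPECTION_AND_VERIFICATION = 2
    INSTALLATION_INSPECTIONS = 3
    COMPONENT_TESTING_TO_DESIGN_LOADS = 4
    SYSTEM_INTEGRATION_AT_FULL_DESIGN_LOADS = 5


def commissioning_verdict(documented_level: NetaLevel, retro_commissioned: bool = False) -> str:
    """Classify a facility by its documented commissioning evidence."""
    if documented_level < NetaLevel.SYSTEM_INTEGRATION_AT_FULL_DESIGN_LOADS:
        return "reject"        # no documented Level 5 integration test
    if retro_commissioned:
        return "compromise"    # Level 5 claimed, but tested in a live facility
    return "acceptable"        # Level 5 completed prior to operations


print(commissioning_verdict(NetaLevel.COMPONENT_TESTING_TO_DESIGN_LOADS))
print(commissioning_verdict(NetaLevel.SYSTEM_INTEGRATION_AT_FULL_DESIGN_LOADS, retro_commissioned=True))
```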

Bottom Line – even a multi-million dollar facility has no integrity without a detailed design specification and complete integration/commissioning test.

The Human Factor in Continuing Facility Operations

Assuming the facility adequately completes integration and commissioning at NETA Level 5, the next step is ensuring the facility has a comprehensive continuing operations plan to manage their electrical (and mechanical/air conditioning) systems. There are two main recommendations for ensuring the annual, monthly, and even daily equipment maintenance and inspection plans are being completed.

Computerized Maintenance Management System (CMMS)

Data centers and central offices are complex operations, with thousands of moving parts and thousands of things that can potentially break or go wrong. A CMMS tries to bring all those components together into an integrated resource that includes (according to Wikipedia):

  • Work orders: Scheduling jobs, assigning personnel, reserving materials, recording costs, and tracking relevant information such as the cause of the problem (if any), downtime involved (if any), and recommendations for future action
  • Preventive maintenance (PM): Keeping track of PM inspections and jobs, including step-by-step instructions or check-lists, lists of materials required, and other pertinent details. Typically, the CMMS schedules PM jobs automatically based on schedules and/or meter readings. Different software packages use different techniques for reporting when a job should be performed.
  • Asset management: Recording data about equipment and property including specifications, warranty information, service contracts, spare parts, purchase date, expected lifetime, and anything else that might be of help to management or maintenance workers. The CMMS may also generate Asset Management metrics such as the Facility Condition Index, or FCI.
  • Inventory control: Management of spare parts, tools, and other materials including the reservation of materials for particular jobs, recording where materials are stored, determining when more materials should be purchased, tracking shipment receipts, and taking inventory.
  • Safety: Management of permits and other documentation required for the processing of safety requirements. These safety requirements can include lockout-tagout, confined space, foreign material exclusion (FME), electrical safety, and others.

And we can also add additional steps such as daily equipment inspections, facility walkthroughs, and staff training.
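The preventive maintenance idea described above, jobs triggered by a calendar schedule and/or a meter reading, can be sketched in a few lines. This is not how any particular CMMS product works; the class, field names, and example intervals are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PmTask:
    """One preventive maintenance job for one asset (hypothetical model)."""
    asset: str
    description: str
    last_done: date
    interval_days: Optional[int] = None            # calendar trigger, if any
    meter_at_last_service: Optional[float] = None  # e.g. generator run hours
    meter_interval: Optional[float] = None         # meter trigger, if any


def is_due(task: PmTask, today: date, current_meter: Optional[float] = None) -> bool:
    """A task is due when either its calendar or its meter trigger has been reached."""
    calendar_due = (
        task.interval_days is not None
        and today >= task.last_done + timedelta(days=task.interval_days)
    )
    meter_due = (
        task.meter_interval is not None
        and task.meter_at_last_service is not None
        and current_meter is not None
        and current_meter - task.meter_at_last_service >= task.meter_interval
    )
    return calendar_due or meter_due


# Example: a monthly load test, plus a service every 250 run hours.
load_test = PmTask("Generator 1", "Monthly load test", date(2009, 9, 1), interval_days=30)
service = PmTask("Generator 1", "250 hour service", date(2009, 9, 1),
                 meter_at_last_service=1000.0, meter_interval=250.0)

print(is_due(load_test, date(2009, 10, 14)))
print(is_due(service, date(2009, 10, 14), current_meter=1100.0))
```

A real system would attach work orders, materials, and safety permits to each due task; the point here is only that both trigger types reduce to a simple comparison that can be audited.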

SAS 70 Audits

The SAS 70 Audit is becoming more popular with companies to force the data center operator to provide audited documentation by a neutral evaluator that they are actually completing the maintenance, security, staffing, and permitting activities as stated in marketing and other sales negotiations.

Wikipedia defines a SAS70 Audit as:

“… the professional standards used by a service auditor to assess the internal controls of a service organization and issue a service auditor’s report. Service organizations are typically entities that provide outsourcing services that impact the control environment of their customers. Examples of service organizations are insurance and medical claims processors, trust companies, hosted data centers, application service providers (ASPs), managed security providers, credit processing organizations and clearinghouses.

There are two types of service auditor reports. A Type I service auditor’s report includes the service auditor’s opinion on the fairness of the presentation of the service organization’s description of controls that had been placed in operation and the suitability of the design of the controls to achieve the specified control objectives. A Type II service auditor’s report includes the information contained in a Type I service auditor’s report and also includes the service auditor’s opinion on whether the specific controls were operating effectively during the period under review.”

Many companies considering outsourcing within the financial services industries are now considering a SAS70 audit essential to considering candidate data center facilities to host their data and applications. Startup companies with savvy investors are demanding SAS70 audits. In fact, any company considering outsourcing their data or applications into a commercial data center should demand to obtain or review SAS70 audits for each facility considered.

Otherwise, you are forced to “believe” the words of a marketer’s spin, a salesman’s desperate pitch, or the words of others to provide confidence your business will be protected in another company’s facility.

One thing to keep in mind about SAS70 audits: the audit only reviews items the data center operator chooses to audit. Thus, a company may have very nice and polished SAS70 audit documentation, yet the contents may not include every item you need to ensure the data center operator has a comprehensive operations plan. You may consider finding an experienced consultant to review the SAS70 document, and provide any additional guidance on whether or not the audit actually includes all facility maintenance and management items needed to ensure continuing protection from mechanical, monitoring/management, electrical, security, or human staffing failures.

Finally, Know Your Facility

Facility operators are traditionally reluctant to show a potential customer or tenant their electrical and mechanical diagrams and “as-built” documentation for the facility. This is where you would find a 40 year old aluminum buss duct, single points of failure, and other infrastructure designs and realities you should know before putting your business into a data center or carrier hotel.

So, when all other data center and carrier hotel facilities appear equal, in geography and interconnections, look at facilities which will incur the least impact if your interconnections are disrupted, and demand your candidate data center operator and hosting provider are able to provide you complete documentation on the facility, commissioning, CMMS, and SAS70.

Your business, the global marketplace, and network-connected world depend on forcing the highest possible standards of facility design and operation.

John Savageau, Long Beach


An employee enters the meet-me-room at a major carrier hotel in Los Angeles, New York, or Miami. He is a young guy recently graduated from high school, hired to do cable removal for circuit disconnects at minimum wage. Although young, he has a wife and child, and has recently been fighting with in-laws over his ability to support a family. Frustration and anger overcome his emotions, and he turns to the ladder rack jammed with cable and starts hammering at the cables for all he is worth.

Network operations centers around the world see circuits dropping, and customers with critical financial, military, Internet, and broadcast news services are shut down. In the space of about one minute our young employee has taken down several thousand individual circuits, creating near chaos in the global telecommunications community.

In their report on Trusted Access to Communications Infrastructure, the NSTAC Vulnerabilities Task Force advises “it is important to recognize that any one individual with malicious intent accessing any critical telecommunications facility could represent a threat. The threat of insiders performing malicious acts also transcends each type of site discussed in this document.”

The event noted in part 2 of this series, describing the outages in Northern California following damage to a manhole housing telecom cable, was real. The resulting disruption to regional communications was a wakeup call to the telecom community, law enforcement, and communities affected. It is clear the perpetrator knew what he was doing, and knew exactly what vulnerabilities the major telecom companies had which he could exploit.

There have been many other cases, such as Level 3 Communications’ loss in 2006 of a major core router supporting regional Internet services in London due to theft, and a break-in at BT’s switching facility in Birmingham during the same period resulting in the loss of thousands of telephone lines. These show this is not just an American problem, but a global vulnerability.

The message is clear: as an industry, our most obvious threat to information and communications security is not a natural disaster; it is people with industry knowledge or access to our critical facilities.

The Telecom and Data Center/Carrier Hotel Industry’s Role in Managing Human Security Risks

Data centers and central offices are in a constant state of change, maintenance, and growth. While facility network operations staff are generally long term employees, with a history of employment and performance, many others entering our data centers are not well known to the landlord.

Janitorial and maintenance staff are normally contracted to vendors, mechanical and electrical workers are contracted to maintenance and engineering companies, and construction contractors often use temporary staff from agencies such as “Labor Ready” and other day labor companies. In most cases data center or landlord employees are given a cursory background check prior to employment, however others entering even critical areas within the data center or central office meet-me-room may be entirely unknown to the facility.

While normally under some level of supervision, or access management, contractors, maintenance people, and even data center tenants are often free to move around the facility without direct security observation. As shown above, it would only take an angry, disgruntled, or undisciplined person seconds to cause a major calamity in our global communications system.

In a worst case, that person may be a terrorist with a detailed plan to cause damage to the facility once given even minimal access. High voltage electricity, water systems from cooling infrastructure, or access to switching equipment and cable interconnections are all exposed within the data center, and any element could be used to cause a major disruption within the meet-me-room or data center.

Most carrier hotels are located in “mixed-use” buildings, in high-rises with additional tenants who may not even be in the data center or telecom industry. This compounds the problem, as those tenants are often reluctant to comply with security and access requirements at the level of a critical telecom facility.

The issue becomes even more acute when we realize that much of the infrastructure supporting carrier hotels transits “risers” between floors, often through floors occupied by non-telecom tenants who may have physical access to riser space within their offices.

There are a few data centers within the United States where security is comprehensive enough to reduce the risk of malicious intent to a very low level. While many tenants find the access and supervision within the facility extreme, facility resources are protected from all but the most aggressive vandalism or attack.

The NSTAC recommends that in the US the telecom industry establish best-practices guidelines to screen personnel prior to unescorted or unrestricted access to critical facilities, such as carrier hotels and carrier central offices. This may include a national agency check to ensure the person requesting access does not already have a profile indicating they could potentially be a threat to the facility.

The US government may give this additional support, as much of the US government, state, and local communications services are supported either in carrier central offices or commercial carrier hotels.

Recommendations for the Communications Industry

While it is clear not all persons entering a data center or carrier hotel facility can be completely screened, there are tasks each carrier and commercial data center operator should complete. Those could include:

  • Complete background checks for all direct employees
  • Pre-employment screening which would include a personality profile (indicating if they are in a high risk category for emotional stress)
  • Supervision of all contractors on site by a direct company employee who is aware of the risk posed for each type of equipment in proximity to the contractor (such as electrical equipment <UPS, breaker panels, switchgear, chilled water pipes, etc>)
  • Training in situational awareness – being able to identify activities not normal for others in your facility
  • Cooperation with law enforcement and other agencies
  • Working with industry groups to create and follow an industry “best practices” for facility security and human resource management
  • Ensure at least in the streets and areas immediately adjacent to the facility all manhole covers and utility entry points are locked and secured, preventing persons from accessing telecom, electrical, and water infrastructure supporting the building

“Unfortunately our most likely enemies will throw explosives into unguarded cable interconnect rooms or drop cans of petrol into unlocked manholes. End of Cyber War. You might characterize this as the provenance of a 23 year old fundamentalist Skywalker with a cell phone modem and a wild-eyed cousin in Munich figuring out how to blow up the Internet Death Star and stop Predator attacks on his village. Totally asymmetric dude! (From Bob Fonow’s “The Death Star?: Cyber Security vs. Internet security”)”

The commercial operators of data centers and carrier hotels have a tremendous responsibility not only to their owners and shareholders, but also to the global telecom community and global economic community. The potential impact, even in the short term, of a malicious attack on a meet-me-room at One Wilshire, 60 Hudson, the Westin Building, Telehouse in London, or the NAP of the Americas would be immediate and extremely disruptive.

Human factors are the threat. Let’s not forget the lessons learned over the past couple of years. We need to stay diligent, maintain good situational awareness, and understand the sense of urgency we must apply to ensuring our communications infrastructure is secure.

Let us know your opinions, experiences, and recommendations

John Savageau, Long Beach


February 1996. A half-ton bomb was planted in a small truck near South Quay Station, close to the recently renovated commercial district of Canary Wharf. The bomb detonated around 1900 hours, bringing down a six-story building and severely shaking Canary Wharf Tower and other buildings around the Docklands area. The area, home to much of the telecommunications interconnection capacity connecting the UK and Europe to the rest of the world, was severely damaged and all surrounding activity disrupted.

Today the Docklands area continues to support many important, high density communications interconnection points, including Telehouse Europe, the London Internet Exchange (LINX), and the London Network Access Point (LONAP) – in addition to individual nodes and facilities operated by European and other international telecommunications carriers.

This includes companies operating submarine fiber optic cable systems. These densely interconnected areas are referred to as telecommunications “SuperNodes,” or, if the interconnections are concentrated within individual buildings, “carrier hotels.”

The US National Security Telecommunications Advisory Committee (NSTAC) defines a carrier hotel (or SuperNode) as “conditioned floor space operated by a commercial landlord for the purpose of hosting multiple service providers.” The most well-known supernodes are 60 Hudson in New York City, the NAP of the Americas in Miami, One Wilshire in Los Angeles, and the Westin Building in Seattle.

Carrier hotels emerged in the late 1990s following the Telecommunications Act of 1996, which required US incumbent carriers to provide interconnection or collocation space for the new competitive carrier industry. The problem for the carriers, and the opportunity for commercial building owners, was that the carrier facilities were exhausting their available space.

The commercial landlords were able to provide building space, partially due to low occupancy in city center areas near large carrier central offices (such as Bunker Hill in Los Angeles) during the late 1990s, and competitive carriers were able to build out their interconnection infrastructure with little or no interference by the incumbent carriers.

Carrier hotels can also be considered “scale free,” with the only real limitation on growth being the physical space available within a property, as well as electricity and cooling for electronics and switching equipment. This may not even be a large problem, as much of the carrier hotel interconnection volume is done through “passive” cross connects. These cross connects are fiber optic to fiber optic splices which do not require local electronics, and thus are not directly vulnerable to cooling and power issues.

What is the Impact of Losing a Carrier Hotel or SuperNode?

Could another attack similar to the 1996 Docklands incident potentially have the impact of severing interconnection capacity between communications carriers, Internet service providers, and news or information resources?

The extent of disruption would depend on the amount of switching and multiplexing equipment and physical interconnection capacity each company locates within the Telehouse facility, or the immediate area.

This is a source of much debate. In the US, nearly all facility-based (own their own cable) carriers and large virtual carriers have numerous interconnection sites located throughout the country. The loss of a single node or interconnection facility would not significantly disrupt national or international communications.

The Federal Communications Commission provides guidelines for facility-based carriers through the Network Reliability and Interoperability Council (NRIC) which advises “carriers place a high priority on service reliability by building networks with alternative routes, backup facilities, and other assurance capabilities.”

The danger at the SuperNode or carrier hotel is not necessarily one of the incumbent or long distance facility-based carrier. It is more an issue with:

  • International carriers with only one or two physical landing points in North America (or Europe)
  • Local exchange carriers with limited interconnection capacity outside of the carrier hotel
  • Internet service providers operating in a smaller geography (Tier 3 access networks)
  • Hosting companies and content delivery providers with single or limited Internet access
  • Local fiber providers with limited diversity within a city center

This is actually quite alarming. When you start to consider the outsourcing industry, including cloud computing, entertainment, and the number of companies who do not have strong disaster recovery plans – including geographic diversity within their applications and communications access - the potential for disruption is high.

Most of the SuperNodes provide interconnections for more than 200 facility-based carriers, networks, content providers, cloud service providers, and other hosting or business outsourcing companies. Given that we live in a very global economy, losing the interconnection capacity of even one SuperNode could render a large percentage of global financial, logistics, business-to-business, disaster response, and government communications inoperable for hours or days while restoral plans are either implemented or conceived.

Companies with hosted applications and data center presence either in or near the failure point could be isolated or destroyed. A hosted company “single-threaded” with one carrier connection, using the carrier hotel as its main interconnection point, would be shut down.

The bottom line: companies without a strong restoral, backup, and disaster recovery plan and a physically diverse network will suffer a catastrophic failure of their systems, with the length of outage entirely dependent on the facility’s ability to recover from an outage or failure.

If more than one SuperNode is disrupted, such as all facilities on the US West coast, the loss of both Internet links (the majority of international communications today) and dedicated capacity will cause significant damage and disruption to both US and international communications.

What Can Cause a Major Failure?

There are many factors to consider, both human and natural, when looking at global communications infrastructure. Just in the past 5 years we’ve seen significant submarine cable disruptions with causes ranging from undersea earthquakes and cable cuts to strong waves hitting cable landing facilities on the coasts. Carrier hotels are primarily located on the coasts, in large cities, due to the proximity of submarine cables supporting international communications, and the fact that most North American and European terrestrial cable routes tend to interconnect at major coastal cities.

Coastal cities are vulnerable to:

  • Earthquakes
  • Typhoon and Hurricane wind/storm swells
  • Tsunami
  • Tropical rain and flooding

Human factors are also a concern, with potential problems such as:

  • Civil disorder
  • Terrorist attack
  • Vandalism
  • Employees (disgruntled, human error, etc)

If you look at the streets adjacent to buildings such as One Wilshire, you can see the evidence of dozens of carrier tags trying to mark and protect their conduit routes running through the streets, and entering the carrier hotel facility at One Wilshire. Few of the manholes around the area are locked, and few if any local building security officers or police officers will ever challenge a company setting up a couple of traffic cones and entering the manhole.

The potential for human disruption, just by having access below the street level near a building such as One Wilshire or 60 Hudson could be extreme. From below ground potential terrorists have access to power substations, water lines, and hundreds of conduits supporting the entire metro area – including the carrier hotel. A well placed explosive below grade in downtown Los Angeles could potentially disrupt the communications of more than 450 network and Internet-connected companies operating within One Wilshire or immediately adjacent buildings.

Many of the carrier hotels do not have battery backup or even redundant power, as the “meet-me-rooms” fell under the “scale free” rapid growth in the late 1990s and 2000s, when those rooms had little or no management or administrative controls/regulation. This is gradually being brought under control in the largest facilities, and most smaller facilities such as the NAP of the Americas in Miami are very well controlled.

This was proven possible during the 1996 attack in London, and could occur again at any single, or multiple carrier hotel facilities located in the United States and other countries. It is a real problem, and one that is not lost on governments around the world.

What We Can, and Are Doing to Protect Our Communications Assets

The key to all applications and communications security is diversity and redundancy. Very few submarine cables are being built today without at least a diverse loop, or a restoral agreement with a competitive cable company. If there is a single location or cable disrupted across the oceans, and restoral capacity is planned, the problem can be managed.

For North American carriers and Internet Service Providers, having a network with multiple “peering” points in different geographic locations will minimize disruption, and in the case of most regional and global networks that is the case. In fact, most large Internet networks require interconnections in multiple locations before they will consider “peering” relationships. That is of course for both traffic management, as well as disaster planning.

This would mean an Internet Service Provider would best plan their network not only for physical high capacity interconnections in multiple carrier hotels, but also for peering or disaster peering plans for interconnecting at public peering points, such as PAIX, Any2, Equinix, and Telehouse in the US, or other major Internet Exchange Points (IXPs) in London, Amsterdam, Frankfurt, and Asian cities.

For those carriers and ISPs planning long distance interconnections, care must be taken to ensure route diversity. In some cases, multiple carriers will purchase capacity on a wholesale fiber provider’s infrastructure (such as Level 3 Communications, XO, and Time Warner), so several different network providers may be selling capacity on the same long distance route over the same cable system. Circuits bought from two different carriers can therefore still share a single physical path.

In many cases, such as the cable landing stations dotting Long Island in New York, the actual cables connecting those facilities to the carrier hotels and their own cable capacity management facilities follow a single route. The risk is that a single backhoe, terrorist, or vandal could potentially cause serious international communications damage by simply cutting a trough across the roadway, or jumping into a manhole and cutting cable.

“Vandals are to blame for the massive phone and Internet outage in Silicon Valley on Thursday, an AT&T representative has confirmed.” (CNET News, 9 Apr 2009)

An incident in early 2009 near San Jose (California) where an individual performed a similar act of vandalism caused significant disruption across a large area in Northern California. The above story confirms the danger present when critical infrastructure is not adequately protected, and a single person can enter a manhole with the potential of such widespread impact.

Physical cable and route diversity guarantees should be part of every disaster recovery and route planning negotiation.
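One way to make a route diversity review concrete is to list the physical segments each circuit passes through and look for overlap. The following is a minimal sketch in Python, assuming your providers will disclose segment-level route data. The circuit names and segment labels are hypothetical and are not taken from any real facility record.

```python
from itertools import combinations

# Hypothetical route records: each circuit maps to the set of physical
# segments (manholes, conduits, risers) that it passes through.
routes = {
    "carrier_a_primary": {"manhole_12", "conduit_7th_st", "riser_east"},
    "carrier_b_backup": {"manhole_12", "conduit_grand_ave", "riser_west"},
    "carrier_c_backup": {"manhole_31", "conduit_olive_st", "riser_west"},
}


def shared_segments(routes):
    """Map each pair of circuits to the physical segments they share."""
    overlaps = {}
    for (name_a, segs_a), (name_b, segs_b) in combinations(sorted(routes.items()), 2):
        common = segs_a & segs_b
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


def diverse_pairs(routes):
    """List the circuit pairs that share no physical segment at all."""
    overlapping = shared_segments(routes)
    return [
        pair
        for pair in combinations(sorted(routes), 2)
        if pair not in overlapping
    ]
```

In this sample data only the carrier A primary and carrier C backup circuits are physically diverse. The other two pairings each share a manhole or a riser, so a single cut or fire at that point would take down both circuits even though they were bought from different carriers.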

Those companies outsourcing their mission and company-critical data and applications must look at geographic diversity, with the ability to dynamically restart applications within industry and customer-acceptable recovery point and recovery time objectives. Cloud computing technology is getting closer to providing this, but is not quite ready to offer service level objectives.

The US Government Weighs In

The NSTAC believes the government should work with private industry to develop both operational best practices, as well as a solid, coordinated, threat warning system to assist carrier hotel, data center, and SuperNode operators to ensure the best level of security for national and global infrastructure.

Police departments should have some level of visibility into carrier hotels and SuperNodes, data centers, and telecommunications company central offices. Not because we want “big-brother” looking into our business, but because we want law enforcement to understand the nature of our telecom business, and what could potentially happen if human beings are able to damage local infrastructure (which includes emergency responder infrastructure).

The NSTAC recommends individuals employed at carrier hotels and critical infrastructure facilities go through an initial security check. This may be in part because the national authorities probably have their own communications running through SuperNodes, and have recognized there is a reasonable chance US government and military communications could also be damaged or disrupted in the event of a facility failure or loss.

The FCC and NSTAC also recognize the burden of responsibility ultimately falls on the individual networks and customers. Our economy and communications infrastructure depend on each company having good disaster recovery and diversity plans. Individual users must ensure we get service level agreements with a clause ensuring physical route diversity in backup and DR site interconnections.

ISPs need to multi-home their networks. Not just at a single interconnection point, carrier hotel, or IXP – but in separate facilities, preferably in separate geographies.
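The distinction between being multi-homed at one site and being multi-homed across facilities and geographies can be checked mechanically. This is a rough sketch in Python, assuming a simple inventory of interconnections. The `Interconnect` record, its fields, and the provider names are illustrative assumptions, not an industry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interconnect:
    """One upstream or peering connection (hypothetical record layout)."""
    provider: str
    facility: str
    metro: str


def multihoming_report(links):
    """Report whether the links span multiple providers, facilities, and metros."""
    providers = {link.provider for link in links}
    facilities = {link.facility for link in links}
    metros = {link.metro for link in links}
    return {
        "multi_provider": len(providers) > 1,
        "multi_facility": len(facilities) > 1,
        "multi_geography": len(metros) > 1,
    }


# Two providers, but both connections land in the same building.
single_site = [
    Interconnect("transit_a", "One Wilshire", "Los Angeles"),
    Interconnect("transit_b", "One Wilshire", "Los Angeles"),
]

# Two providers in separate facilities in separate metro areas.
diverse = [
    Interconnect("transit_a", "One Wilshire", "Los Angeles"),
    Interconnect("transit_b", "Westin Building", "Seattle"),
]
```

The `single_site` inventory reports multiple providers but a single facility and geography, which is exactly the case the paragraph above warns against. The `diverse` inventory passes all three checks.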

The government is working with representatives from the telecom, vendor (electronic switching equipment, etc), applications, and business communities, and with government agencies, on a continuing basis to ensure US policy is kept current, and the threat/risk to our current infrastructure is understood. The President’s National Security Telecommunications Advisory Committee (NSTAC) is now part of the US Department of Homeland Security, and coordinates much of the discussion.

As users, we need to take action as well. We can do any or all of the following to ensure not only our security in global communications, but also at our businesses and home:

  • Ask your hosting provider if they have a disaster recovery plan – Get proof
  • Ask your network provider if they are multi-homed and multi-homed in multiple geographies – Get proof
  • Ask your provider if their physical diversity is using physically separate fiber routes
  • Ask your hosting provider if they have good coordination with law enforcement for local security – Get proof
  • Ask your international VPN (virtual private network) provider if their cable system has a restoral plan, or if you have geographic fail-over on a separate cable – Get proof

In short, the burden is ultimately on the end user to ensure their business or activity survives a major disaster. We must drive our vendors, and should seriously consider strongly supporting greater regulation and oversight of our critical infrastructure facilities to ensure we do not lose a resource that could potentially contribute to a global economic and communications catastrophe.

What are your concerns? Do you believe we are OK in our current telecom environment? Should we do more? Your comments are welcome.

John Savageau, Long Beach

Other articles in this series:

  • Risk and Security in the Telecommunications Industry Series – Part 1