Private Cloud · 11 min read

Colo chargeback without tears: a reference architecture for shared-facility cost allocation.

Hybrid FinOps Editorial Apr 10, 2026
TL;DR

Public-cloud chargeback has mature reference architectures; colocation chargeback is still spreadsheets. A defensible colo chargeback architecture has four pieces: a metering layer (DCIM, switch counters, carrier bills), a tag store (CMDB joining assets to cost centers), an allocation engine that resolves monthly invoice lines against metered usage, and a reconciliation report that sums back to the invoice. Use a hybrid allocation — direct metered cost plus a fairness-weighted overhead share — and follow the KISS principle, not accounting perfectionism.

Key takeaways
  • Colo bills arrive as five to seven invoice lines; allocation must reconcile to the exact totals.
  • DCIM rack-PDU metering at 15-minute granularity is the minimum viable source of power data.
  • Bandwidth is 95th-percentile billed; allocation must use the same method, not average Mbps.
  • Cross-connects are per-link monthlies — allocate direct, do not spread.
  • Hybrid allocation (direct + fairness overhead) beats pure resource-weighted schemes in practice.

Why colo chargeback stays hard

Cloud chargeback became tractable because the provider sells you a granular, API-accessible billing feed. Colocation is the opposite by design. The provider sells space, power, and connectivity in contract units that do not decompose to per-application costs. The provider has no interest in reporting your draw to your engineering teams; that is your problem.

Three structural problems follow.

Contractual granularity is coarse. A cage is billed as a flat monthly space charge, a committed power draw with overage, and per-circuit cross-connect fees. None of those lines reconciles to "how much did Team A spend last month" without a metering layer you build yourself.

Metering lives outside finance. Per-rack power comes from DCIM or intelligent PDUs. Bandwidth comes from switch SNMP counters or carrier reports. Depreciation lives in fixed-asset accounting. None of these systems talk to each other natively.

Tags are a CMDB problem. Colo hardware does not carry a cost-center tag in firmware. The tag is a CMDB entry keyed on asset-ID or rack-U position, and it is only as good as the intake process that populates it.

The FinOps Foundation's shared-cost guidance is clear about the overall shape: pick an allocation strategy, pick a tagging strategy, pick a shared-cost strategy, and stick with them. The architecture below is how to operationalize that in a colo estate.

The invoice is the anchor

A typical colocation bill has five to seven line items: space (rack or cage rental), committed power, power overage, cross-connects, bandwidth or carrier transit, and remote-hands labor, give or take a one-time fee. Any allocation architecture that does not reconcile to these exact lines will fail finance review.

Build the allocation engine so each of these lines is allocated by its own rule. Do not average them into a $/rack blended rate; you will lose the signal that drives behavior.
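
To make "one rule per line" concrete, here is a minimal sketch of the rule table as configuration. The line names and rule labels are illustrative, not a standard schema; match them to whatever your invoice and engine actually call them.

```python
# Illustrative mapping of colo invoice line items to allocation rules.
# Names are examples only; align them with your own invoice and engine.
ALLOCATION_RULES = {
    "space_rack_cage":   "direct_by_rack_ownership",  # flat monthly, owner pays
    "power_committed":   "metered_rack_kw",           # DCIM/PDU draw per rack
    "power_overage":     "metered_rack_kw",
    "cross_connects":    "direct_per_link",           # never spread
    "bandwidth_transit": "p95_mbps",                  # same method the carrier bills
    "remote_hands":      "direct_per_ticket",         # requesting team pays
}
```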

Cross-connect counts and 95th-percentile bandwidth drive allocation of the connectivity bill.
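
If the carrier bills at the 95th percentile, the allocation must compute it the same way. Below is a minimal sketch of the usual carrier calculation, assuming 5-minute Mbps samples; confirm the sampling interval and whether inbound and outbound are measured separately against your carrier contract.

```python
def p95_mbps(samples_mbps: list[float]) -> float:
    """Carrier-style 95th percentile: sort samples descending, discard the
    top 5 percent, and bill at the highest remaining value."""
    if not samples_mbps:
        return 0.0
    ordered = sorted(samples_mbps, reverse=True)
    return ordered[int(len(ordered) * 0.05)]

# A 30-day month has roughly 8,640 five-minute samples, so a one-day burst
# (288 samples, about 3 percent) is discarded entirely: a link that idles at
# 40 Mbps and spikes to 900 Mbps for a day still bills near 40 Mbps.
```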

The reference architecture

Four components, kept explicit.

1. Metering layer

Per-rack power draw comes from DCIM or intelligent PDUs, at 15-minute granularity or better. Bandwidth comes from switch SNMP counters reconciled against carrier reports, measured with the same 95th-percentile method the carrier bills. Cross-connect and circuit inventory comes from the carrier and facility invoices themselves. Land all of it in one store, keyed on asset-ID or rack-U, at monthly grain.
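
A minimal sketch of the roll-up from 15-minute PDU samples to a monthly rack-kW figure, assuming readings arrive as (rack_id, kW) pairs; adapt the shape to whatever your DCIM exports.

```python
from collections import defaultdict

def monthly_rack_kw(readings: list[tuple[str, float]]) -> dict[str, float]:
    """Average kW per rack for the month from fixed-interval PDU samples.
    Assumes complete coverage; gaps in DCIM data should be flagged upstream,
    not silently averaged over."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for rack_id, kw in readings:
        totals[rack_id] += kw
        counts[rack_id] += 1
    return {rack: totals[rack] / counts[rack] for rack in totals}
```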

2. Tag store (CMDB)

The CMDB is the tag store. Every rack-U and every asset carries four mandatory tags: cost_center, business_unit, application_id, and environment.

Tag compliance is the difference between a chargeback that runs and a chargeback that collapses. Make tags mandatory at asset intake, audit monthly, and publish a compliance scorecard per team. Teams whose tag compliance falls below a threshold get a pro-rata allocation of untagged cost applied to their share — which is rare, painful, and extremely effective at driving compliance.
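
A minimal sketch of the scorecard and the pro-rata penalty, assuming assets are exported from the CMDB as dicts with an owner_team field plus the four tag fields; all names here are illustrative.

```python
REQUIRED_TAGS = ("cost_center", "business_unit", "application_id", "environment")

def compliance_by_team(assets: list[dict]) -> dict[str, float]:
    """Fraction of each team's assets carrying all four mandatory tags."""
    flags: dict[str, list[int]] = {}
    for asset in assets:
        ok = all(asset.get(tag) for tag in REQUIRED_TAGS)
        flags.setdefault(asset["owner_team"], []).append(1 if ok else 0)
    return {team: sum(v) / len(v) for team, v in flags.items()}

def spread_untagged_cost(untagged_cost: float,
                         shares: dict[str, float],
                         compliance: dict[str, float],
                         threshold: float = 0.95) -> dict[str, float]:
    """Pro-rata share of untagged cost applied only to teams below the
    compliance threshold, weighted by their existing allocated share."""
    offenders = {t: shares.get(t, 0.0)
                 for t, c in compliance.items() if c < threshold}
    weight = sum(offenders.values()) or 1.0
    return {t: untagged_cost * s / weight for t, s in offenders.items()}
```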

3. Allocation engine

The engine joins invoice lines to metered usage via the tag store, one rule per line: power allocated by metered rack-kW, cross-connects direct per link, bandwidth at the same 95th-percentile method the carrier bills, remote hands direct to the requesting team, and overhead (space, cooling, UPS) spread proportional to rack-kW share.
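
A minimal sketch of that join, assuming the tag store has already been used to roll metered usage up into per-line, per-cost-center shares; all structures here are illustrative.

```python
def allocate_invoice(invoice: dict[str, float],
                     usage_share: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """Resolve each invoice line against its usage-share table.

    invoice:     {line_item: billed_total} straight off the colo bill.
    usage_share: {line_item: {cost_center: fraction}}, fractions per line
                 summing to 1.0 (rack-kW share for power and overhead,
                 link counts for cross-connects, p95 Mbps for transit).
    Returns {cost_center: {line_item: allocated_amount}}.
    """
    allocated: dict[str, dict[str, float]] = {}
    for line, billed in invoice.items():
        for cost_center, fraction in usage_share.get(line, {}).items():
            allocated.setdefault(cost_center, {})[line] = billed * fraction
    return allocated
```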

4. Reconciliation report

Publish monthly. Three columns per line item: invoice total, allocated total, variance. Acceptable variance is low single-digit percent; anything higher means the allocation engine or the tag store has drifted. The reconciliation report is the artifact that turns chargeback from a political exercise into a finance document.
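
A minimal sketch of that report, built on the allocation output sketched above; the 2 percent tolerance is an illustrative default, not a rule.

```python
def reconcile(invoice: dict[str, float],
              allocated: dict[str, dict[str, float]],
              tolerance_pct: float = 2.0) -> list[tuple[str, float, float, float]]:
    """One row per line item: (line, invoice total, allocated total, variance %).
    Variance beyond the tolerance signals engine or tag-store drift."""
    rows = []
    for line, billed in invoice.items():
        alloc = sum(per_line.get(line, 0.0) for per_line in allocated.values())
        variance = 100.0 * (alloc - billed) / billed if billed else 0.0
        if abs(variance) > tolerance_pct:
            print(f"DRIFT on {line}: {variance:+.1f}% exceeds {tolerance_pct}%")
        rows.append((line, billed, alloc, variance))
    return rows
```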

Allocation scheme: time-weighted, resource-weighted, or hybrid?

The AWS Transit Gateway chargeback example makes the same distinction in cloud terms. For colo, the three choices look like this.

Time-weighted. Allocate each shared cost proportional to how long a resource is held, regardless of how hard it is used. Simple, but unfair to bursty workloads: a team that holds a rack for a monthly batch job pulls a month's worth of allocation for a day's worth of consumption.

Resource-weighted. Allocate proportional to actual metered consumption. Most fair but requires metering everything directly, which is rarely achievable for overhead categories.

Hybrid. Allocate direct costs resource-weighted and overhead costs by a fairness rule (most commonly rack-kW share). This is what converges in practice. It is defensible, it rewards efficient teams, and it does not pretend cooling load can be metered per application.
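
Per team, the hybrid scheme reduces to one line of arithmetic. A minimal sketch with rack-kW as the fairness key; the numbers in the example are invented.

```python
def hybrid_charge(direct_cost: float, overhead_total: float,
                  team_rack_kw: float, estate_rack_kw: float) -> float:
    """Monthly charge = directly metered cost + overhead spread by rack-kW
    share (the fairness rule for space, cooling, and UPS)."""
    return direct_cost + overhead_total * (team_rack_kw / estate_rack_kw)

# Example: $6,200 metered power and cross-connects, $40,000 shared overhead,
# 12 kW of a 100 kW estate -> 6,200 + 40,000 * 0.12 = $11,000.
```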

Pick hybrid. Do not argue about the other two past the first quarter.

The cloud/colo boundary: bandwidth egress

A hybrid estate that runs workloads in colo and cloud often sees a boundary problem: cloud egress shows up in the cloud bill; carrier transit shows up in the colo bill; neither directly tells you what a user-facing application's end-to-end network cost is.

The fix is to tag at the application-id grain in both directions. Cloud cost providers can apply FOCUS-aligned tags. Carrier circuits are fewer and flatter; assign each to an application-id in the CMDB. Then the allocation engine can sum both sides of the application's network footprint into one number. That number is what the application owner uses to make architecture decisions; the two underlying invoices stay separate for reconciliation.
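
A minimal sketch of that sum, assuming both sides have already been reduced to (application_id, cost) rows: cloud egress from the tagged billing export, carrier transit from the colo allocation engine. The row shape is an assumption, not a provider format.

```python
from collections import defaultdict

def network_cost_by_app(cloud_egress_rows: list[tuple[str, float]],
                        colo_transit_rows: list[tuple[str, float]]) -> dict[str, float]:
    """Sum cloud egress cost and allocated carrier-transit cost per
    application_id; the two underlying invoices stay separate for reconciliation."""
    totals: dict[str, float] = defaultdict(float)
    for app_id, cost in cloud_egress_rows + colo_transit_rows:
        totals[app_id] += cost
    return dict(totals)
```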

What this buys you

Three things change when a colo chargeback runs monthly and ties to the invoice.

First, teams see their allocation next to their production outcomes. Power density becomes a first-class conversation, not a facility-team worry.

Second, the refresh cycle gets informed by the chargeback. A rack that consumes 8 kW today for a workload that runs on hardware depreciated in 2021 is a candidate for refresh on cost grounds, not only capacity grounds.

Third, hybrid decisions become defensible. When an application owner proposes moving a workload from colo to public cloud, the argument can be made in like-for-like allocated costs, not in a $/instance-hour versus $/rack-U mismatch.

What to skip

Do not try to build per-application power metering. It does not exist in most racks and will cost more to build than it saves. Allocate at rack-kW and accept the rack as the finest grain.

Do not allocate cooling per application. Proportional to rack-kW is close enough, and the argument it provokes is not worth winning.

Do not make remote-hands tickets a shared overhead. They are direct costs; the team that called the ticket pays.

Do not publish a real-time dashboard before publishing a monthly reconciliation. The monthly reconciliation is what earns trust; the dashboard is what gets used to run the practice once trust is established.

The KISS rule, once more

The FinOps Foundation's Allocation capability and its shared-cost working group both land on the same point: there is no single correct allocation; there is only a defensible one your organization will stop arguing about. For colocation that means a hybrid scheme with rack-kW as the spine, cross-connects allocated directly, bandwidth at 95th percentile, overhead spread by fairness rule, and a monthly reconciliation report that always ties to the invoice.

If you get those five pieces right, colo chargeback stops being a quarterly crisis. It becomes a finance-grade input to the decisions that actually matter — refresh, relocation, and hybrid-cloud architecture.

Frequently asked questions

How do you allocate shared power in a colocation facility?

Meter per-rack PDU draw from DCIM, aggregate to monthly rack-kW, then allocate by CMDB-maintained rack ownership. Allocate cooling/UPS overhead proportional to rack-kW as a fairness uplift.

What tagging convention works for colo chargeback?

Four mandatory tags — cost_center, business_unit, application_id, environment — enforced at asset intake and audited monthly. Tags live in the CMDB and join to meter data on asset-ID or rack-U.

How do you reconcile colo chargeback with finance?

Publish a monthly report with invoice total, allocated total, and variance for each line item. Variance in low single digits is acceptable; higher variance means the engine or the tag store drifted.

What is the KISS principle for shared-cost allocation?

Pick the simplest defensible scheme and stop arguing. A hybrid allocation with rack-kW as the spine is what most mature practices converge on.

Sources

  1. AWS Cloud Financial Management — Chargeback shared services: a Transit Gateway example
  2. FinOps Foundation — Allocation capability
  3. FinOps Foundation — Managing shared cloud costs working group
  4. FinOps Foundation — How to put together shared costs
  5. Azure Noob — Azure chargeback architecture: the two models that actually work
  6. AWS Well-Architected — Consider using shared resources (cost optimization pillar)
  7. Flexential — What is a colocation data center?