Many features developed for large systems have now been incorporated into successively smaller platforms. One of the major features of Compact PCI, and one reason for its acceptance and growth, was in-service removal, or hot-swapping, which allows live parts of the system to be removed without affecting overall performance.
Compact PCI, developed in the early 1990s, introduced some useful features previously only available on much more expensive systems. Electronics is getting smaller and more powerful every day because of the demand for increased performance, but the power consumed by processors and chipsets is increasing too, and power produces heat. The knock-on effect is the need to ensure adequate ventilation and circulation of cooling air through the PSU and other components to keep them within their design parameters. If insufficient airflow is available, the power supply heats up and is more prone to failure.
Based on redundancy principles, all major components, such as the power supplies and cooling fans, in addition to the processor board and peripherals, were engineered to allow removal without powering down the complete system. Although in practice the live exchange was as uneventful as unplugging and removing a failed unit, and popping a service replacement into position, the background was somewhat more complex.
In a bus-based system, it is quite likely that the act of removal causes the processor to go into overload, looking for a component that was previously available (even if it was not functioning properly). If a cooling fan breaks down, will there be sufficient airflow through the remaining fans to keep the temperature within the design specification, or will it rise higher and thus cause further problems? With redundancy in power supplies it is also necessary to ensure they are suitably equipped for power sharing and to cope with any spikes when replacement power is switched on, apart from the economic considerations of having spare power supplies in the system.
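The fan-failure question above can be explored with a first-order heat balance. The following is a minimal sketch, not a design tool: it assumes all dissipated power is carried away by the airflow, uses typical sea-level values for air density and specific heat, and takes fan flow figures at face value, ignoring the back pressure of a populated chassis. The fan count, flow per fan and heat load are hypothetical illustration values, not figures from any specification.

```python
# First-order estimate of cooling-air temperature rise through a chassis.
# Assumption: all dissipated power is removed by the airflow (steady state).

AIR_DENSITY_KG_M3 = 1.2            # approximate, air at about 20 degC, sea level
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # approximate specific heat of dry air


def air_temperature_rise(power_w: float, airflow_m3_s: float) -> float:
    """Return the inlet-to-outlet air temperature rise in kelvin."""
    if airflow_m3_s <= 0:
        raise ValueError("airflow must be positive")
    return power_w / (AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * airflow_m3_s)


# Hypothetical example values, chosen only for illustration.
HEAT_LOAD_W = 1000.0
FAN_COUNT = 4
FLOW_PER_FAN_M3_S = 0.05

rise_all_fans = air_temperature_rise(HEAT_LOAD_W, FAN_COUNT * FLOW_PER_FAN_M3_S)
rise_one_failed = air_temperature_rise(HEAT_LOAD_W, (FAN_COUNT - 1) * FLOW_PER_FAN_M3_S)

print(f"All fans running: {rise_all_fans:.1f} K rise")
print(f"One fan failed:   {rise_one_failed:.1f} K rise")
```

Under these assumptions, losing one fan of four increases the air temperature rise by a third. Whether that is acceptable depends on the inlet temperature and the limits of the components being cooled.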
Every component on a board is a potential heat source and may generate large amounts of heat. All the components must be kept below their designed temperature limits if reliability is to be assured. Demands on cooling airflow have grown constantly as more and more is required of ever smaller systems, with the consequent increase in energy per unit volume. This is offset to some degree by developments in lower power silicon, such as that incorporated in laptops and hand-held devices. The energy problem manifests itself in several ways. First, as more energy is consumed, more heat is generated. Secondly, as the volume of the active components shrinks, the power density increases, as does the demand on the cooling system components. Finally, as the overall system shrinks, there is a greater barrier to the cooling airflow, as well as less space available to accommodate the cooling system and power supplies without them becoming disproportionate to the active part of the system.
When fabric-based methods such as PICMG 2.16 were introduced, there was initially little change in the features except for a greater tolerance of additions to and removals from the system, and far greater bandwidth. Shortly afterwards, the Advanced TCA platform, PICMG 3.0, was released. While it addressed some of the size relationship issues by using -48V DC as the power source and converting to usable voltages on-board, it actually increased the plug-in board size to allow for this and for multiple function availability on the plug-in units.
Advanced TCA also set the benchmark of 200W per board, together with the requirement to maintain cooling for normal use for an extended time during the failure of one element of the cooling system. This was first achieved commercially using uprated blowers in a two-by-two redundant format.
Cooling and power redundancy were incorporated, with provision also being made for twin redundant shelf managers. Due to the increase in power to around 3kW per shelf, it was deemed unreasonable to distribute the lower voltages across the backplane, hence the -48V DC across the system with on-board conversion.
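The reasoning behind the -48V DC distribution follows from the relationship between power, voltage and current. The short sketch below shows the current a backplane would have to carry to deliver 3kW at different distribution voltages. The 12V and 3.3V figures are assumed here as examples of typical board-level voltages; the text does not name the specific lower voltages, and conversion losses are ignored.

```python
def bus_current_a(power_w: float, voltage_v: float) -> float:
    """Current in amperes needed to deliver power_w at voltage_v (I = P / V)."""
    if voltage_v == 0:
        raise ValueError("voltage must be non-zero")
    return power_w / abs(voltage_v)


SHELF_POWER_W = 3000.0  # "around 3kW per shelf"

# -48 V is from the text; 12 V and 3.3 V are assumed example rails.
for voltage in (-48.0, 12.0, 3.3):
    print(f"{voltage:>6.1f} V -> {bus_current_a(SHELF_POWER_W, voltage):7.1f} A")
```

Delivering the same power at a few volts would need currents of several hundred amperes, and the resistive loss in the conductors grows with the square of the current, which is why distribution at the higher voltage is attractive.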
From Advanced TCA, with its application targeted squarely on telecoms, came derivatives. Initially, part of the flexibility of Advanced TCA was due to the provision of carriers to hold front pluggable subassemblies. These, also derived from previous form-factor mezzanine cards, soon became widespread as advanced mezzanine cards (AMCs).
As markets other than telecoms considered ways to incorporate parts of the Advanced TCA structure, it became obvious that smaller systems could be developed which, while being based on the AMCs of Advanced TCA, could dispense with one whole layer of interconnect by plugging directly into a high speed (fabric) backplane. The new, smaller, architecture was ratified as Micro TCA by PICMG and has since taken off as a smaller form-factor platform for reliable, yet complex, systems. Apart from the cost reductions achieved by direct connection of the AMCs to the backplane, many developers requested systems without the in-built redundancy initially envisaged (the telecoms market was still at this stage a major player in the Micro TCA product development), thus further reducing costs where the system application was not critical.
A common feature of Micro TCA is the management concept. Each AMC module, power module and the cooling system has to be provided with management control. The Micro TCA carrier hub (MCH) takes over the control of the individual management building blocks. Micro TCA defines so-called cooling units (CUs). One or two of these CUs can be in a system, with each unit consisting in turn of one or several fans. One CU should be sufficient for the cooling requirements of the system, with the optional second CU arranged as a redundant component, depending on the criticality of the application.
Cooling is still a significant consideration, due partly to the wide range of configurations being developed for Micro TCA chassis systems as a result of the reliability and flexibility of the overall structure. It is also often overlooked that, although Advanced TCA was originally designed to allow 200W per board, the 40W of a Micro TCA AMC actually represents twice the power density of the Advanced TCA 200W, thus presenting even greater problems for the chassis and system designer.
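The density comparison can be checked with simple arithmetic. Taking power density as power divided by occupied volume, the sketch below works out what the "twice the density" statement implies about relative size. No board dimensions are assumed; only the 200W and 40W figures and the factor of two from the text are used, and the function name is illustrative.

```python
def implied_volume_fraction(small_power_w: float,
                            large_power_w: float,
                            density_ratio: float) -> float:
    """Volume of the small module as a fraction of the large board's volume.

    Derived from:
        small_power / small_volume = density_ratio * (large_power / large_volume)
    """
    return small_power_w / (large_power_w * density_ratio)


fraction = implied_volume_fraction(small_power_w=40.0,
                                   large_power_w=200.0,
                                   density_ratio=2.0)
print(f"AMC volume is about {fraction:.0%} of an Advanced TCA board's volume")
```

In other words, the statement holds if an AMC occupies roughly one tenth of the volume of an Advanced TCA board.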
With the adoption of Micro TCA in a wider industrial base, AMCs, and the whole system, have been undergoing ruggedisation. This is not because of any inherent unreliability, but rather because of the increased shock and vibration, wider temperature and humidity variations, airborne contaminants and other problems associated with the general industrial environment, which are not found in the normally benign, temperature-controlled rooms with filtered air used to house telecoms equipment. There is a world of difference between a telephone equipment room and the general industrial shop floor where machinery operates.
To conclude, it can be seen that although there are continual difficulties for the enclosure manufacturer as silicon becomes ever more powerful and consistently smaller generation on generation, the interface features and cooling systems are still managing to cope with the complexity of demand. Will there come a day when the electronics is just too small to allow the necessary interfaces?
David Bowring is Rittal’s product manager for electronic systems