|
Unbounded Environments
Today, bounded environments ensconced within clearly demarcated
perimeters are giving way to a milieu where gateways have
been rendered obsolete. In this environment, the distinction
between insiders and outsiders has blurred, and organisations
neither possess central administrative control over their
information systems nor do they have access to a global view
of events occurring therein. In such an environment, it is
virtually impossible to thwart every cyber attack. Traditional
information security models are ineffective when confronted
with the security problems associated with open-ended environments.
Since no system is entirely impervious to attacks in an unbounded
environment, there is now an intense focus on ensuring survivability
of mission-critical systems and essential services, despite
the presence of cyber attacks. Emerging technologies such
as grid computing and web services make unbounded environments
even more vulnerable, making it necessary to build capabilities
into systems such that they have the resilience to survive
an attack and continue to fulfil their business mission in
a timely manner. The 'survive' philosophy of modern information
security is a paradigm shift from the 'prevent' viewpoint
of traditional security models.
Survivability and availability focus on preserving essential
services in unbounded environments, even when systems in such
environments are penetrated and compromised.
Survivability
Definition Survivability is defined as the capability
of a system to fulfil its mission, in a timely manner, in
the presence of attacks, failures, or accidents.
| In this definition: |
| :: |
System includes networks
and large-scale systems |
| :: |
Mission refers to a set of very high-level
or abstract goals of an organisation |
| :: |
Timeliness is of such
criticality that it is included explicitly in the definition |
| :: |
Attacks are potentially damaging events
orchestrated by an intelligent adversary |
| :: |
Failures are potentially damaging events
caused by deficiencies in the system or in an external
element on which the system depends
|
| :: |
Accidents describe a broad range of
randomly occurring and potentially damaging events such
as natural disasters
|
Characteristics of Survivable Systems
Identification and protection of essential services is a vital
ingredient of a practical approach to building and analysing
survivable systems. Maintenance of essential properties is
central to the delivery of essential services.
| :: |
Essential services are defined as those
functions of the system that must be maintained when
the environment is hostile, or when failures or accidents
occur that threaten the system
|
| :: |
Essential properties include specified
levels of integrity, confidentiality, performance, and
other quality attributes
|
Key to the concept of survivability is the
identification of essential services, and the essential properties
that support them, within an operational system. The overall
function of a system should adapt to preserve essential services.
Thus, the capability of a survivable system to fulfil its
mission in a timely manner is linked to its ability to deliver
essential services in the presence of an attack, accident,
or failure.
To deliver essential services, survivable systems should have
the following vital characteristics:
| :: |
Resistance to attacks |
| :: |
Recognition of attacks and the extent of
damage |
| :: |
Recovery of full and essential
services after attack |
| :: |
Adaptation and evolution to reduce effectiveness
of future attacks |
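The four characteristics above can be illustrated with a small sketch. It is only a toy model under stated assumptions: the class name `SurvivableSystem`, the two operating modes, the service names and the source address are all hypothetical and are not taken from any particular product or framework. The model refuses known-hostile sources (resistance), sheds non-essential services when an attack is recognised, restores full service afterwards (recovery), and keeps its block list after recovery (adaptation).

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"      # full service
    DEGRADED = "degraded"  # essential services only


@dataclass
class SurvivableSystem:
    """Toy model: names and behaviour are illustrative assumptions."""
    essential: frozenset
    optional: frozenset
    mode: Mode = Mode.NORMAL
    blocked_sources: set = field(default_factory=set)

    def accepts(self, source: str) -> bool:
        # Resistance: refuse sources already identified as hostile.
        return source not in self.blocked_sources

    def on_attack_recognised(self, source: str) -> None:
        # Recognition: drop to degraded mode so essential services survive.
        self.mode = Mode.DEGRADED
        # Adaptation: remember the source to blunt future attacks.
        self.blocked_sources.add(source)

    def recover(self) -> None:
        # Recovery: restore full service; the block list is kept.
        self.mode = Mode.NORMAL

    def running_services(self) -> frozenset:
        if self.mode is Mode.NORMAL:
            return self.essential | self.optional
        return self.essential


system = SurvivableSystem(
    essential=frozenset({"payments"}),
    optional=frozenset({"reporting"}),
)
system.on_attack_recognised("203.0.113.7")
print(sorted(system.running_services()))  # essential services only
system.recover()
print(sorted(system.running_services()))  # full service restored
```

The point of the sketch is the separation between essential and optional services: the mission continues in degraded mode rather than stopping altogether.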
Developing Survivability Solutions
Survivability solutions are risk management strategies
that primarily depend on an intimate knowledge of the mission
being protected. The focus on the mission results in the extension
of survivability solutions beyond purely independent technical
solutions.
| :: |
Creating strategies Firstly,
risk mitigation strategies must be created in the context
of a mission's requirements, which are prioritised sets
of normal and stress requirements. They must be based
on "what-if" analyses of survival scenarios
and contingency planning.
|
| :: |
Forecasting scenarios
Survival scenarios positing a wide range of cyber attacks,
accidents, and failures assist in the analyses and contingency
planning. These scenarios focus on adverse effects rather
than causes. Effects are also of more immediate situational
importance than causes, because an organization will
likely have to deal with and survive an adverse effect
long before a determination is made as to whether the
cause was an attack, an accident, or a failure.
|
| :: |
Planning Contingency
and disaster planning requires that risk management
decisions and economic tradeoffs be made by executive
management, with guidance from technical experts in
the application domain, computer security, and other
software engineering and related disciplines.
|
Survivability depends equally upon the risk
management skills of an organization and upon the technical
expertise of information security experts. This is certainly
appropriate from an organizational perspective, because business
risk management is a primary responsibility of executive management,
and not the role of information security experts. The role
of the experts in security is to provide executive management
with the information necessary to make informed risk-management
decisions. Thus, the preparatory steps necessary for survivability
must be taken by an organization as a whole, rather than by
security experts alone.
Trends in Survivability Solutions New research
methods and tools under development to support survivability
solutions encompass the following approaches:
| :: |
Designating a portion of the infrastructure
as the essential minimum and hardening that portion against
attacks
|
| :: |
Making the requirements for survivability
explicit, identifying functionality whose absence currently
prevents adequate satisfaction of those requirements
|
| :: |
Exploring techniques for designing
and developing highly survivable systems, despite the
presence of untrustworthy subsystems and untrustworthy
participants
|
| :: |
Recommending specific architectural
structures that can lead to survivable systems and networks
capable of either preventing or tolerating a wide range
of threats
|
| :: |
Taking an adaptive control systems
perspective on survivability, so that control of a
system can continue in the face of disruption to elements
of both the system and its control system
|
| :: |
Examining survivability requirements for
real-time command and control systems |
Availability
Definition Availability is defined as a disciplined
methodology encompassing the entire IT infrastructure that ensures
consistent and predictable access to any component
of the infrastructure.
In this definition, availability means that business systems
continue to provide acceptable levels of performance under
normal conditions as well as under unexpected events
and circumstances.
| :: |
Unexpected events Incidents
of lost data, system failures or unforeseen contingencies
are often termed 'unexpected events' and cause disruptions
to businesses. Over a reasonable period of time, every
organisation will experience something unpredictable
that shuts down one or more systems. This is not merely
likely; it is inescapable. Only the timing and precise
nature of unplanned downtime are unknown in advance.
|
Characteristics of Availability
| :: |
Availability is a critical business
requirement to ensure the accessibility of information
resources as and when needed. Indeed, availability of
information is so critical that it forms one leg of
the 'CIA' (Confidentiality-Integrity-Availability) triad,
which is the foundation of information security.
|
| :: |
Availability is not a product or service,
but a set of practices utilized to ensure appropriate
levels of access to data and applications. It requires
detailed planning, meticulous implementation and periodic
review.
|
| :: |
The 'availability' of an information
system is measured not only by the ability of the system
design and implementation to satisfy required functionality,
but also by the adequacy of redundancy in terms of system
hardware, software and procedures built into the system
to safeguard against potential disruptions.
|
| :: |
Managing availability stretches beyond
analysing potential hardware failures. It involves managing
the whole environment (data, applications, servers,
operating systems, middleware, etc.) to ensure users
can access data and applications as and when, how and
where they need them. Ensuring availability entails
facilitating consistent and predictable access to information
resources.
|
| :: |
Ensuring availability does not always
imply keeping systems accessible all the time. For instance,
a business may attach little importance to system
downtime outside business hours. What it does
imply is that critical information must be accessible
during pre-specified critical hours, which could mean
24 hours a day for e-businesses and global corporations.
|
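The idea of measuring availability only against pre-specified critical hours can be sketched as follows. This is a minimal illustration, not a standard method: the function names, the 08:00 to 18:00 window, the 30-day period and the sample outages are all assumptions made for the example, and outages are assumed to be given as naive start and end timestamps that fall within the measurement period.

```python
from datetime import date, datetime, time, timedelta


def window_downtime(outages, window_start, window_end):
    """Total outage time that falls inside the daily critical window."""
    total = timedelta()
    for out_start, out_end in outages:
        day = out_start.date()
        # Walk every calendar day the outage touches.
        while day <= out_end.date():
            win_start = datetime.combine(day, window_start)
            win_end = datetime.combine(day, window_end)
            overlap = min(out_end, win_end) - max(out_start, win_start)
            if overlap > timedelta():
                total += overlap
            day += timedelta(days=1)
    return total


def window_availability(outages, window_start, window_end, days):
    """Fraction of required (critical-hours) time the system was up."""
    daily = (datetime.combine(date.min, window_end)
             - datetime.combine(date.min, window_start))
    required = daily * days
    return 1.0 - window_downtime(outages, window_start, window_end) / required


# Hypothetical outages: one overnight, one straddling the end of the window.
outages = [
    (datetime(2024, 3, 3, 2, 0), datetime(2024, 3, 3, 5, 0)),
    (datetime(2024, 3, 10, 17, 0), datetime(2024, 3, 10, 19, 30)),
]
print(window_downtime(outages, time(8), time(18)))
print(window_availability(outages, time(8), time(18), days=30))
```

In this example the overnight outage does not count against availability at all, and only the part of the second outage that falls before 18:00 is charged. The same outages would look considerably worse if measured against a 24-hour requirement.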
Availability and Reliability
The terms Availability and Reliability are often wrongly treated
as synonyms. Reliability is commonly expressed as the Mean Time
Between Failures (MTBF) of a hardware component. Reliability is
therefore one component of availability rather than the whole of
it. With the increasing quality of hardware equipment, however,
reliability is less of a concern to most IT managers.
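The relationship between the two terms can be made concrete with the widely used steady-state formula, availability = MTBF / (MTBF + MTTR). The mean time to repair (MTTR), the function name and the sample figures below are assumptions introduced for illustration; the text above mentions only MTBF.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time a component is up.

    mtbf_hours: mean time between failures (reliability).
    mttr_hours: mean time to repair (an assumption added for this sketch).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# A very reliable component that is slow to repair...
slow_repair = steady_state_availability(mtbf_hours=10_000, mttr_hours=100)
# ...versus a less reliable one that is repaired quickly.
fast_repair = steady_state_availability(mtbf_hours=1_000, mttr_hours=1)

print(f"{slow_repair:.4%}")
print(f"{fast_repair:.4%}")
print(fast_repair > slow_repair)
```

With these hypothetical figures the less reliable component turns out to be the more available one, because it is restored so much faster. This is why equating the two terms is misleading.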
Developing Availability Solutions The availability
of Information Resources can be ensured by employing an appropriate
combination of tools, services and processes. The vulnerabilities
relating to availability in all data, applications, sites,
communication links, etc. should be analysed; the threats
to the individual components should be evaluated; and the
potential downtime cost should be assessed before a solution
is actually deployed.
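A simple way to assess potential downtime cost is to multiply, for each component, the expected number of outages per year by the expected duration of an outage and by the cost of an hour of downtime. The sketch below does this and ranks the components; the `Component` type, its fields and all figures are hypothetical, and a real assessment would draw them from the organisation's own threat and vulnerability analysis.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """All fields are illustrative estimates, not measured data."""
    name: str
    outages_per_year: float   # expected frequency of disruption
    hours_per_outage: float   # expected duration of each disruption
    cost_per_hour: float      # business cost of one hour of downtime


def expected_annual_downtime_cost(component: Component) -> float:
    return (component.outages_per_year
            * component.hours_per_outage
            * component.cost_per_hour)


def rank_by_downtime_cost(components):
    """Most expensive exposure first, to guide where to invest."""
    return sorted(components, key=expected_annual_downtime_cost, reverse=True)


inventory = [
    Component("intranet wiki", outages_per_year=6, hours_per_outage=2, cost_per_hour=100),
    Component("order database", outages_per_year=2, hours_per_outage=4, cost_per_hour=5000),
    Component("WAN link", outages_per_year=4, hours_per_outage=1.5, cost_per_hour=3000),
]

for component in rank_by_downtime_cost(inventory):
    print(component.name, expected_annual_downtime_cost(component))
```

Comparing these figures with the cost of a proposed availability solution gives a first, rough indication of whether the solution is worth deploying for a given component.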
A generic strategy for deploying an availability solution
encompasses the following:
| :: |
Environmental Assessment
The first step is to assess the existing environment
and determine the availability requirements. For each
IT component under consideration, the assessment should:
:: Determine the availability
needs and concerns
:: Determine the existing recovery levels
based on the current environment
:: Assess what changes can be implemented
to gain additional levels of recovery
:: Gain technical acceptance from the entire
IT department |
|
| :: |
Planning Services After
the existing environment has been analyzed and the availability
requirements assessed, the next step is to develop a
plan to implement a high availability solution. The
plan must define the:
:: Project Objectives
:: Project Team Members
:: Deliverables Timetable
:: Required tasks
:: Required IT infrastructure changes |
|
| :: |
Education & Training
Human error is a primary cause of unplanned downtime.
A formal and periodic end-user and system operator training
program on the use and maintenance of Information Systems
can significantly reduce instances of unplanned system
downtime. Similarly, a documented and tested recovery
plan can significantly reduce the duration of any outages.
|
| :: |
Availability Architecture
From the assessment and implementation processes outlined
above flows the architecture that needs to be put
in place to assure organizational availability of IT
infrastructure. The availability architecture would
include the following components and implementations: backup
& restore, third-party recovery sites, journaling,
data vaulting, commitment control, uninterruptible power
supplies, fault-tolerant hardware, RAID, disk mirroring,
non-clustered multiple systems, clusters, alternate
communication paths, heterogeneous replication, auditing
services, etc.
|
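Several of the architectural components listed above (disk mirroring, clusters, alternate communication paths) work by adding redundancy. Their effect can be estimated with two textbook formulas, sketched below under the strong assumption that components fail independently: redundant copies in parallel give 1 - (1 - a)^n, and components that must all work in series give the product of their availabilities. The function names and sample values are illustrative only.

```python
from math import prod


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, any one of which suffices.

    Assumes independent failures, which real deployments only approximate.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return 1.0 - (1.0 - component_availability) ** copies


def series_availability(availabilities) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


single_server = 0.99
mirrored_pair = parallel_availability(single_server, copies=2)

# Hypothetical chain: network link, mirrored servers, storage.
end_to_end = series_availability([0.999, mirrored_pair, 0.995])

print(mirrored_pair)
print(end_to_end)
```

The series calculation is a reminder that hardening one element is not enough: the end-to-end figure is limited by the weakest component in the chain, which is why the whole environment has to be managed.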
Conclusion
The continual escalation between offensive threats and
defensive countermeasures has demonstrated time and again
that no practical system can be built that is invulnerable
to attack. Despite the industry's best efforts, there can
be no assurance that systems will not be breached.
Thus, the traditional view of information systems security
must be expanded to encompass the specification and design
of systems that assure availability and survivability in spite
of attacks. Only then can systems be created that are robust
in the presence of attacks and able to survive attacks
that cannot be completely repelled, assuring the organisation
of the availability of its mission-critical systems.
|