Proceedings of the 16th ACM Symposium on Operating Systems Principles, Saint-Malo, France, October 1997
A Decentralized Model for Information Flow Control
Andrew C. Myers Barbara Liskov
MIT Laboratory for Computer Science
545 Technology Square, Cambridge, MA 02139
{andru,liskov}@lcs.mit.edu
Abstract
This paper presents a new model for controlling information flow in systems with mutual distrust and decentralized authority. The model allows users to share information with distrusted code (e.g., downloaded applets), yet still control how that code disseminates the shared information to others. The model improves on existing multilevel security models by allowing users to declassify information in a decentralized way, and by improving support for fine-grained data sharing. The paper also shows how static program analysis can be used to certify proper information flows in this model and to avoid most run-time information flow checks.
1 Introduction
The common models for computer security are proving inadequate. Security models have two goals: preventing accidental or malicious destruction of information, and controlling the release and propagation of that information. Only the first of these goals is supported well at present, by security models based on access control lists or capabilities (i.e., discretionary access control, simply called "access control" from this point on). Access control mechanisms do not support the second goal well: they help to prevent information release but do not control information propagation. For example, if one user is allowed to read another user's data, the owner cannot control how the reader distributes the information it has read. Control of information propagation is supported by existing information flow and compartmental models, but these models unduly
This research was supported in part by DARPA Contract N00014-91-J-4136, monitored by the Office of Naval Research, and in part by DARPA Contract F30602-96-C-0303, monitored by USAF Rome Laboratory.
Copyright (c) 1997 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that new copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or [email protected].
restrict the computation that can be performed. The goal of
this work is to make information flow control more useful by
relaxing these restrictions.
Information flow control is vital for large or extensible
systems. In a small system, preventing improper propaga-
tion of information is easy: you don’t pass data to code whose
implementation is not completely trusted. This simple rule
breaks down in larger systems, because the trust requirement
is transitive: any code the data might travel to must also be
trusted, requiring complete understanding of the code. As
the system grows larger and more complex, and incorpo-
rates distrusted code (e.g., web applications), complete trust
becomes unattainable.
Systems that support the downloading of distrusted code
are particularly in need of a better security model. For ex-
ample, Java [GJS96] supports downloading of code from re-
mote sites, which creates the possibility that the downloaded
code will transfer private data to those sites. Java attempts
to prevent these transfers by using a compartmental security
model, but this approach largely prevents applications from
sharing data. Also, different data manipulated by an applica-
tion have different security requirements. A security model
is needed that supports fine-grained information sharing be-
tween distrusted applications, while reducing the potential
for information leaks.
This paper contains exploratory work towards a new
model of decentralized information flow that is also inexpensive in both space and time. Our model allows users to control the flow of their information without imposing the
rigid constraints of a traditional multilevel security system.
The goal of this model is to provide security guarantees to
users and to groups rather than to a monolithic organization.
It differs from previous work on information ?ow control by
allowing users to explicitly declassify (or downgrade) data
that they own. When data is derived from several sources,
all the sources must agree to release the data.
Our model can largely be checked statically, like some
existing information flow models [DD77, AR80]. We define user-supplied program annotations, called labels, that describe the allowed flow of information in a program. Annotated programs can be checked at compile time, in a manner similar to type checking, to ensure that they do not violate information flow rules. Compile-time checks have no run-
time overhead in space or time, and unlike run-time checks,
when they fail, they do not leak information about the data
the program is using.
Our work extends existing models by allowing individu-
als to declassify data they own, rather than requiring a central
authority to do it. In addition, we extend the static checking
model in three ways. First, we introduce an implicit form
of parametric polymorphism, called label polymorphism, to
express procedures that are parametric with respect to the
security labels of their arguments, and with respect to the
principal on whose behalf the procedure executes. Label
polymorphism extends the power of static analysis, and al-
lows programmers to write generic code. Second, since purely static analysis would be too limiting for structures like file systems, where information flow cannot be verified statically, we define a new secure run-time escape hatch for these structures, with explicit run-time label checks. Uses of the run-time information flow mechanism are still partially verified statically, to ensure that they do not leak information. Third, we show that despite these features, the labels of local variables can be inferred automatically, easing the job of adding flow annotations to a program.
Our goal in exploring these techniques is to eventually
support the following useful applications:

- Secure servers and other heavily-used applications can be written in programming languages extended with information flow annotations, adding confidence that sensitive information is not revealed to clients of the service through programming errors.

- Secure compiled code may be transferred from a remote site and run locally with less concern that it might leak information. Code transfer is useful both for clients, which download applications from servers, and for servers, which upload code and data from clients for remote evaluation.
The annotations could be used to extend many conven-
tional programming languages, intermediate code (such as
JVM [LY96]), or machine code, where the labeling system
defined here makes a good basis for security proofs [Nec97]. Labeled machine code and security proofs could work together: proof annotations for object code would be generated as a byproduct of compiling a program that contains information flow annotations.
The remainder of the paper describes the model and how
checking is done. The model is intended to control covert
and legitimate storage channels; it does not deal with timing
channels, which are harder to control. The work assumes
the existence of a reliable, efficient authentication mechanism, and of a trusted execution platform; for example, code may be executed by a trusted interpreter, or generated only by a trusted compiler. When the computational environment contains many trusted nodes connected by a network, the communication links between the nodes must be trusted, which can be accomplished by encrypting communication between nodes.
The organization of the remainder of this paper is as follows. Section 2 briefly describes some systems that can benefit from decentralized information flow control, and which are not well supported by existing models. Section 3 introduces the fundamentals of the new information flow control model. Section 4 discusses issues that arise when code using the new model is statically checked for correctness. Section 5 shows how the model can be integrated into a simple programming language. Section 6 shows how to infer most labels in programs automatically, making the job of annotating a program much simpler. Section 7 describes related work in the areas of information flow models, access control, and static program analysis. We conclude in Section 8 and discuss future work in Section 9.
2 Motivating Examples
Let us consider two examples for which a decentralized model of information flow is helpful: the medical study and the bank, depicted in Figures 1 and 2. The scenarios place somewhat different demands on the information flow model. They demonstrate that our approach permits legitimate flows that would not be allowed with conventional information flow control, and that it is easy to determine that information is not being leaked.
In the ?gures, an oval represents a principal within the
system, and is labeled with a boldface character that indicates
the authority with which it acts. For example, in the medical
study (Figure 1), the important principals are the patient, p,
a group of researchers, R, the owners of a statistical analysis
package, S, and a trusted agent, E. Arrows in the diagrams
represent information flows between principals; square boxes represent information that is flowing, or databases of some
sort.
Each principal can independently specify policies for the propagation of its information. These policies are indicated by labels of the form {o: r}, meaning that owner o allows the information to be read by the readers r, where o is a principal and r is a set of principals. The owner is the source of the information and has the ability to control the policy for its use. For example, in the medical study example, the patient's medical history may be read only by principals with the authority to act on behalf of either the patient principal p or the hospital principal H.
In the diagrams, double ovals represent trusted agents that declassify information (for example, E in the medical study). These agents have the authority to act on behalf of a principal in the system, and may therefore modify the policies that have been attached to data by that principal. One goal of these two examples is to show how our approach limits the trust that is needed by participants in the system; the double ovals identify the places where special trust is needed.

Figure 1: Medical Study Scenario. (The diagram shows patient p's medical history at hospital H, labeled {p: p,H}, flowing through the patient data extractor E to the researchers' statistical database, labeled {R: p,R}; the data relabeled {R: R,S} for the statistics package S, which uses its own database labeled {S: S}; and the results of the study, labeled {}.)
2.1 The Medical Study
The medical study example shows that it is possible to give
another party private information and receive the results of its
computation while remaining confident that the data given to it is not leaked. The purpose of the study is to perform a statistical analysis of the medical records of a large number of patients. Obviously, the patients would like to keep specific
details of their medical history private. The patients give per-
mission to the researchers performing the study to use their
medical data to produce statistics, with the understanding
that their names and other identifying information will not
be released. Thus, the patients put some trust in the patient
data extractor, E, which delivers to the researchers a suitably
abridged version of the patient records. The data extractor
has the authority to act for the patient (p), so it can replace
the patient's policy {p: p,H} with the researcher-controlled policy {R: p,R}, which allows the extracted data to be read
by the researchers and by the patient.
The researchers would like to use a statistical analysis
package that they have obtained from another source, but
the patients and researchers want the guarantee that the anal-
ysis package will not leak their data to a third party. To
accomplish this, the researchers relabel the patient data with {R: R,S}. The analysis package is able to observe but not to leak the relabeled data, since S is only a reader, not an owner.

Figure 2: Bank Scenario. (The diagram shows the customer C's request and the bank B's response to it, both labeled {C: B,C}; the per-customer account data, labeled {Ci: B,Ci}; the bank's private data and total assets, labeled {B: B}; and the trusted industry-standard totaller T.)
The analysis package performs its computations, using the patient data, now labeled {R: R,S}, and its own statistical database, labeled {S: S}. The writers of the analysis package would also like some assurance that their statistical database is not being leaked to the researchers. The result of the computation must retain the policies of both R and S, and therefore acquires the joint label {R: R,S; S: S}. This label only allows flows to the principal S, since S is the only principal in both reader sets. The analysis package then explicitly declassifies the result of the computation, changing the label to {R: R,S} so the researchers can read it. Note that since the analysis package can declassify the analysis result, it is not forced to declassify all information extracted from the statistical database, which would probably require more careful analysis of the analysis code to show that the database contents were not leaked.

Finally, the researchers may declassify the result of their study, changing the label {R: R,S} to the unrestricted label {}. This change allows the general public to see their results, and is acceptable as long as there are so many patients in the study that information about individual patients cannot be extracted from the final result.
This example uses declassification in four places. Each time, declassification takes place according to the simple rule that a principal may modify its own flow policies. Conventional information flow control has no notion of declassification within the label system, and therefore cannot model this example.
2.2 The Bank
The bank scenario is illustrated in Figure 2. A bank serves
many customers, each of whom would like to keep his data
safe from other customers and non-customers. In addition,
the bank stores private information, such as its current assets
and investments, that it would like to keep safe from all
customers and non-customers.
The bank receives periodic requests from each customer,
e.g., to withdraw or deposit money. Each request should be
able to observe only information that is owned by that cus-
tomer, and none of the bank’s private data. The bank is better
than real banks in that it allows customers to control dissemination of their account information; each customer has a distinct information flow policy for his account information, which prevents the bank from leaking the information to another party. The customer's request, the account itself, and the bank's response to the request are all labeled {C: B,C}, allowing the bank to read the information but not to control it. However, the bank's private database, including its record of total assets, is most naturally labeled {B: B}.
To keep the total assets up to date, information derived
from the customer’s request must be applied to the total
assets. To make this possible, the customer places trust in
the totaller, T, a small piece of the bank software that acts
with the authority of both the customer and the bank, and
therefore can declassify the amount of the customer request
in order to apply it to the total asset record. Conceivably,
the totaller is a certified, industry-standard component that
the customer trusts more than the rest of the bank software.
Another reasonable model is that the totaller is part of an
audit facility that is outside the bank’s control.
3 Decentralized Information Flow Control
This section describes our new model of decentralized in-
formation flow. The model assumes a set of principals rep-
resenting users and other authority entities. To avoid loss
of generality, a process has the authority to act on behalf of
some set of principals.
Computations manipulate values. Values are obtained
from slots—variables, objects, other storage locations—that
can serve as sources and sinks for values, and from com-
putations; values can also be obtained from input channels,
which are read-only slots that allow information to enter the
system. A value can be written either to a slot or to an output
channel, which serves as an information sink that transmits
data outside the system.
Values, slots, and channels all have attached labels, which are a more flexible form of the security classes encountered in most information flow models. The flexibility introduced by our labels makes decentralized declassification possible.

The label on a value cannot change, but a new copy of the value can be created with a new label. When this happens we say the value is relabeled, though it is really only the copy that has a new label. The key to secure flow is to ensure that any relabeling is consistent with the security policies of the original labeling. Only values can be relabeled; slots and channels cannot. This restriction allows us to check information flows at compile time. If either slots or channels could be relabeled, we would need run-time label checks whenever they were used.
Sections 3.1 and 3.2 present our new model. Section 3.3 discusses principals in more detail and explains how we achieve flexibility even though slots and channels cannot be relabeled. Section 3.4 defines the relabeling rules, Section 3.5 explains output channels further, and Section 3.6 discusses how our model forms a conventional security-class lattice.
3.1 Overview
A label L contains a set of principals called the owner set, owners(L). The owners are the principals whose data was observed in order to construct the data value; they are the original sources of the information. For each owner o, the label also contains a set of principals called the reader set, readers(L, o). The reader set for a particular owner specifies the principals to whom that owner is willing to release the value. Together, the owners and readers functions completely specify the contents of a label. A useful concept is the effective reader set of L: the set of principals to whom all owners of the data agree to release it. The effective reader set is the intersection of every reader set in L.

An example of an expression that denotes a label L is the following: {o1: r1,r2; o2: r2,r3}, where o1, o2, r1, r2, and r3 denote principals. The owners of this label are o1 and o2, the reader sets for these owners are readers(L, o1) = {r1, r2} and readers(L, o2) = {r2, r3}, and the effective reader set is {r2}.
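To make the label structure concrete, the following Python sketch (ours, not code from the paper) models a label as a map from owners to reader sets; the finite universe of principals and all names are invented for the example.

    # Sketch of the label model: a label maps each owner to its reader set.
    ALL = frozenset({"o1", "o2", "r1", "r2", "r3"})  # assumed finite universe

    def readers(label, owner):
        # An owner absent from the label imposes no restriction, so its
        # reader set is the set of all principals.
        return label.get(owner, ALL)

    def effective_readers(label):
        # The principals every owner agrees may see the value: the
        # intersection of all per-owner reader sets.
        eff = ALL
        for owner in label:
            eff &= label[owner]
        return eff

    # The example label {o1: r1,r2; o2: r2,r3}:
    L = {"o1": frozenset({"r1", "r2"}), "o2": frozenset({"r2", "r3"})}
    assert effective_readers(L) == {"r2"}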
This label structure allows each owner to specify an independent flow policy, and thus to retain control over the dissemination of its data. Code running with the authority of an owner can modify the flow policy for the owner's part of the label; in particular, it can declassify that data by adding additional readers. Since declassification applies on a per-owner basis, no centralized declassification process is needed, as it is in systems that lack ownership labeling. The labels maintain independent reader sets for each owning principal. If, instead, a label consisted of just an owner set and a reader set, we would lose information about the individual flow policies of the owners and reduce the power of declassification.
The key to controlling information flow is to ensure that the policies of each owner are enforced as data is read and written. However, when a value is read from a slot, it acquires the slot's label, which means that whatever label that value had at the time it was written to the slot is no longer known when it is read. In other words, writing a value to a slot is a relabeling. This loss of information is acceptable provided there is no loss of control.
Therefore, we allow a value to be assigned to a slot only if the relabeling that occurs at this point is a restriction, a relabeling in which the new label allows fewer accesses than the original; this happens, for example, if the slot's label allows fewer readers for an owner than the value's label. A relabeling from label L1 to label L2 is a restriction, written L1 ⊑ L2, if L1 has more readers and fewer owners than L2:

    Definition of L1 ⊑ L2:
        owners(L1) ⊆ owners(L2)
        ∀o ∈ owners(L1): readers(L1, o) ⊇ readers(L2, o)

Note that the rules for readers and owners are opposites.
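Continuing the Python sketch from Section 3.1 (again ours, not the paper's), the restriction test transcribes this definition directly; the block is self-contained, with labels as owner-to-reader-set dictionaries and invented principal names.

    # L1 ⊑ L2: owners may only be added, and each existing owner's reader
    # set may only shrink.
    def is_restriction(l1, l2):
        return (set(l1) <= set(l2) and
                all(l1[o] >= l2[o] for o in l1))

    looser  = {"alice": frozenset({"bob", "carol"})}
    tighter = {"alice": frozenset({"bob"}), "dave": frozenset({"bob"})}
    assert is_restriction(looser, tighter)      # fewer readers, more owners
    assert not is_restriction(tighter, looser)  # would drop owner dave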
We could have used a different model in which a slot
stores both a value and the label of that value. However, in
that model the label that is stored in the slot becomes another
information channel. Also, this model would not permit
compile-time label checking. Our approach does permit this
checking, and therefore labels cause little run-time overhead.
As discussed in Section 5, labels can be values themselves;
this feature allows label checking to be deferred till run time,
overcoming the limitations of doing all checking at compile
time.
In addition to slots, the system contains channels, which
allow interaction with external devices: input channels allow
information to be read and output channels allow informa-
tion to be written. Reading from an input channel is just like
reading from a slot; the value is given the channel’s label.
However, writing to an output channel is different from writ-
ing to a slot; as discussed further in Section 3.5, writing to an
output channel is legal if the channel’s readers are a subset
of the readers allowed by the data being written. Creation of
new channels is obviously a sensitive operation.
In this model, it is safe for a process to manipulate data
even though the current principal does not have the right to
read it. This follows because all the process can do with the
data is write it to a slot or a channel provided the data’s label
allows this. Thus, access-control read checks aren’t needed!
Nevertheless, such checks might be desired, to reduce the
risk of exposure through covert channels of sensitive data
such as passwords. One possible extension would be to fold read access checks into label checking: a process can read information from a slot with label L only if the process can act for some principal in the effective reader set of L.
3.2 Derived Labels
During computation, values are derived from other values.
Since a derived value may contain information about its sources, its label must reflect the policies of each of its sources. For example, if we multiply two integers, the product's label must reflect the labels of both operands.
When a program combines two values labeled with L1 and L2, respectively, the result should have the least restrictive label that maintains all the flow restrictions specified by L1 and L2. This least restrictive label, the join of L1 and L2 (written as L1 ⊔ L2), is constructed as follows: the owner set of L1 ⊔ L2 is the union of the owner sets of L1 and L2, and the reader set for each owner in L1 and L2 is the intersection of their corresponding reader sets. This rule can be written concisely, assuming the following natural definition: readers(L, o) for an o that is not in the owner set of L is defined to be the set of all principals, since o imposes no restrictions on propagation. The join rule is then the following:

    Labels for Derived Values (Definition of L1 ⊔ L2):
        owners(L1 ⊔ L2) = owners(L1) ∪ owners(L2)
        readers(L1 ⊔ L2, o) = readers(L1, o) ∩ readers(L2, o)

(The symbol ⊔ has also been used to denote the join of two security classes [DD77, AR80].)

Note that L1 ⊑ L1 ⊔ L2 for all labels L1 and L2. Joining is a restriction and therefore it does not leak information.
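In the same illustrative Python sketch (ours, with invented principal names), the join is a few lines, and a worked check follows.

    # Join: union of the owner sets; for an owner in both labels, the
    # intersection of its reader sets. An owner in only one label keeps
    # its reader set, since the other label imposes no restriction on it.
    def join(l1, l2):
        out = {}
        for o in set(l1) | set(l2):
            if o in l1 and o in l2:
                out[o] = l1[o] & l2[o]
            else:
                out[o] = l1[o] if o in l1 else l2[o]
        return out

    l1 = {"o1": frozenset({"r1", "r2"}), "o2": frozenset({"r2"})}
    l2 = {"o2": frozenset({"r2", "r3"})}
    assert join(l1, l2) == {"o1": frozenset({"r1", "r2"}),
                            "o2": frozenset({"r2"})}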
3.3 The Principal Hierarchy
To allow compile-time analysis, slots must be immutably labeled. Immutable slot labels might seem like a limitation, but we provide two mechanisms to make the labels on slots more flexible: run-time labeling, which is discussed later, and modification of the rights of principals, which changes the set of data that principals can read.

Within a system, principals serve various functions: some represent the full authority of a user of the system; others represent groups of users; and still others represent roles, restricted forms of a user's authority. In practice, these different principals are used quite differently, and many systems treat them as entirely different entities. Some principals have the right to act for other principals and assume their power. For example, every member of a group might have the right to act for the group principal. The acts for relation is reflexive and transitive, defining a hierarchy or partial order of principals. This model is similar to a speaks for hierarchy [LABW91], though roles are treated here as first-class principals. We assume that the principal structure can be queried using the primitive acts-for function to discover whether the current principal has the right to act for another principal.
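The acts-for query itself can be sketched as graph reachability over the grant database; the following is our own illustration, with a made-up hierarchy in which amy belongs to a students group that may in turn act for a users principal.

    # Direct grants: p -> set of principals p may act for.
    GRANTS = {"amy": {"students"}, "students": {"users"}}

    def acts_for(grants, p, q):
        # Reflexive, transitive closure of the direct grants.
        seen, stack = set(), [p]
        while stack:
            cur = stack.pop()
            if cur == q:
                return True
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(grants.get(cur, ()))
        return False

    assert acts_for(GRANTS, "amy", "users")     # transitivity
    assert acts_for(GRANTS, "amy", "amy")       # reflexivity
    assert not acts_for(GRANTS, "users", "amy")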
The right of one principal to act for another is recorded in a database. The database can be modified: for example, to alter the membership of groups, or to transfer a role from one employee to another. Obviously, modifications to the principal structure are extremely powerful and must be restricted by some form of access control. Also, to prevent modifications to the principal structure from serving as a covert channel, the principal database must be labeled in a way that prevents information leaks, just as ordinary databases in the system must be.
3.4 Relabeling Rules
This section restates and discusses our two relabeling rules: restriction, which defines the legality of assignment, and declassification, which allows an owner to modify its flow policy.

Rule 1: Restriction. A relabeling from L1 to L2 is valid if it is a restriction: L1 ⊑ L2. Intuitively, it removes readers, adds owners, or both.

Rule 2: Declassification. A declassification either adds readers for some owner o or removes the owner o. This relabeling can be done only if the process acts for o.

Relabeling by restriction is independent of the principal hierarchy and requires no special privilege to perform. Declassification, by contrast, depends on the acts-for relation and is legal only when the current process possesses the needed authority. It introduces potential security leaks, but only leaks of information owned by the current principal. The current principal has the power to leak its own information, but cannot leak data owned by other principals. Information that is owned by a particular user can only be declassified by code that runs with that user's authority. Note that Rule 2 reflects the transitivity of the acts-for relation. For example, if a process can act for a principal p, it can act for any principal q that p can act for, and therefore can declassify data owned by q.
Analysis of the safety of a piece of code reduces to analy-
sis of the uses of these rules. As we will see, the rules can be
largely checked at compile time, assuming the programmer
is willing to supply some annotations. Code that performs
a relabeling according to Rule 1 can be checked entirely at
compile time, but code that performs a relabeling accord-
ing to Rule 2 requires additional checking, to ensure that the
process acts for the owner. In the language discussed later
in the paper, we require that the use of Rule 2 be indicated
explicitly, since legal but unintended information leaks could
occur otherwise.
An important property of these rules is that the join operator does not interfere with relabeling. It is possible to independently relabel a component of a join: if the relabeling L1 → L2 is legal, then for any other label L3, the relabeling L1 ⊔ L3 → L2 ⊔ L3 is also legal. This property automatically holds for Rule 1 because the set of labels forms a lattice. The property also holds for Rule 2 because the join operator ensures that the flow policies of L3 are not violated.

This property is important because it permits code that is generic with respect to a label (or part of a label) to perform declassification. It is also helpful for writing code like the statistical analysis package in the medical study example. Because Rule 2 can be applied to part of a join, the analysis package can compute answers using values from its private database, then remove its label from the result that it generates. Without this property, it would have to declassify the private database information immediately, forcing it to give up some protection against leaks.
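One way to read the two rules together, sketched below in Python (our own code, not the paper's): a relabeling is legal under a set of authority principals if every owner whose policy is weakened lies in that set. The example mirrors the medical study, where the analysis package, acting for S, strips its own policy from a joint label; the final assertion spot-checks the join-compatibility property on one instance.

    def join(l1, l2):
        return {o: (l1[o] & l2[o] if o in l1 and o in l2
                    else l1.get(o, l2.get(o)))
                for o in set(l1) | set(l2)}

    def relabel_ok(l1, l2, authority):
        for o in l1:
            if o in authority:
                continue          # Rule 2: o's own policy may be weakened
            if o not in l2 or not l1[o] >= l2[o]:
                return False      # otherwise Rule 1 must hold for owner o
        return True               # new owners may always be added

    result = {"R": frozenset({"R", "S"}), "S": frozenset({"S"})}
    public = {"R": frozenset({"R", "S"})}
    assert relabel_ok(result, public, authority={"S"})
    assert not relabel_ok(result, public, authority=set())

    # Relabeling a component of a join: still legal after joining with l3.
    l3 = {"Q": frozenset({"R"})}
    assert relabel_ok(join(result, l3), join(public, l3), authority={"S"})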
3.5 Channels
Data enters the system through input channels and leaves it through output channels. In either case, it is important that the channel's label reflect reality. For example, if a printer can be read by a number of people, it is important that the output channel to that printer identify all of them, since otherwise an information leak is possible.
Therefore, channel creation is a sensitive operation since
it must get the labels right. We assume it is done by trusted
code, which makes use of its understanding of the physical
devices to determine whether the requested channel should
be created, i.e., whether the label proposed by the program
attempting to create the channel matches reality. The channel
creator might additionally rely on authentication, e.g., when
a user logs on, the authentication or login process might
associate the user’s principal with the channel for the display
being used.
A value read from an input channel is labeled with the channel's label. This is the same as what happens when reading a value from a slot. However, the rule for writing to an output channel is different from writing to a slot, since owners don't matter for writing. Instead, the rule is the following: for any reader r of the output channel, the effective readers of the value's label must include a reader r' such that r can act for r'. This rule ensures that the data can be read only by readers who have been authorized to see it. The reason to use the acts-for relation is that by authorizing some r', we have implicitly authorized all principals who can act for r'. This flexibility provides functionality not easily attainable through other mechanisms such as declassification. For example, it allows us to write data that is readable by a group principal to a channel that is readable by a member of the group, since the member principal can act for the group principal.
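A sketch of this output-channel check in the same Python model (ours; the small universe and the member/group hierarchy are invented):

    ALL = frozenset({"owner", "group", "member", "outsider"})

    def effective_readers(label):
        eff = ALL
        for o in label:
            eff &= label[o]
        return eff

    def may_write(value_label, channel_readers, acts_for):
        # Every reader of the channel must act for some effective reader.
        eff = effective_readers(value_label)
        return all(any(acts_for(r, rr) for rr in eff)
                   for r in channel_readers)

    # member acts for group, so group-readable data may be written to a
    # channel whose reader is member.
    acts = lambda p, q: p == q or (p, q) == ("member", "group")
    data = {"owner": frozenset({"group"})}
    assert may_write(data, {"member"}, acts)
    assert not may_write(data, {"outsider"}, acts)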
3.6 Security Class Lattice
If we consider just Rule 1, the set of labels forms a conventional security-class lattice [Den76], where each element in the lattice is one of the possible labels. Labels exist in a partial order as defined by the restriction operator, ⊑. The least restrictive label, denoted by ⊥, corresponds to data that can flow anywhere; the greatest restriction, ⊤, corresponds to data that can flow nowhere: it is readable by no one and owned by everyone. As we go up in the lattice, labels become strictly more restrictive. Data can always be relabeled to flow upward in the lattice, because restriction does not create a possible information leak.

The lattice has a well-defined meet operator, ⊓. Its definition is precisely dual to that of ⊔: it takes the intersection of the owners and the unions of the readers. The meet operator yields the most restrictive label that is no more restrictive than its operands. This operator doesn't seem to be very useful in describing computation, and its use is avoided in order to preserve the ability to easily infer labels, as shown in Section 6.

The effect of declassification is that each principal has access to certain relabelings that do not accord with the lattice. However, these relabelings do not leak information.
4 Checking Labels
Labels can be used to annotate code, and the annotated code can be checked statically to verify that it contains no information leaks. In this section, we discuss some issues related to static analysis of annotated code, though we defer issues of how to extend a programming language till the next section. In Section 4.1, we explain the importance of static checking to information flow control, and the problem of implicit flows [DD77]. In Section 4.2, we describe a simple way to prevent implicit flows from leaking information by using static analysis.
4.1 Static vs. Dynamic Checking
Information ?ow checks can be viewed as an extension to
type checking. For both kinds of static analysis, the com-
piler determines that certain operations are not permitted to
be performed on certain data values. Type checks may be
performed at compile time or at run time, though compile-
time checks are obviously preferred when applicable because
they impose no run-time overhead. Access control checks are
usually performed at run time, although some access control
checks may be performed at compile time [JL78, RSC92].
In general, it seems that some access control checks must be
performed dynamically in order to give the system sufficient flexibility.
By contrast, fine-grained information flow control is practical only with some static analysis, which may seem odd; after all, any check that can be performed by the compiler can be performed at run time as well. The difficulty with run-time checks is exactly the fact that they can fail. In failing, they may communicate information about the data that the program is running on. Unless the information flow model is properly constructed, the fact of failure (or its absence) can serve as a covert channel. By contrast, the failure of a compile-time check reveals no information about the actual data passing through a program. A compile-time check only provides information about the program that is being compiled. Similarly, link-time and load-time checks provide information only about the program, and may be considered to be static checks for the purposes of this work.

    x := 0
    if b then
        x := 1
    end

Figure 3: Implicit information flow
Implicit information flows [DD77] are difficult to prevent without static analysis. For example, consider the segment of code shown in Figure 3, and assume that the storage locations x and b belong to different security classes with labels Lx and Lb, respectively. (We will follow the literature in using the notation Le to refer to the label of the expression e.) In particular, assume b is more sensitive than x (more generally, Lb ⋢ Lx), so data should not flow from b to x. However, the code segment stores 1 into x if b is true, and 0 into x if b is false; x effectively contains the value of b. A run-time check can easily detect that the assignment x := 1 communicates information improperly, and abort the program at this point. Consider, however, the case where b is false: no assignment to x occurs within the context in which b affects the flow of control. The fact of the program's aborting or continuing implicitly communicates information about the value of b, which can be used in at least the case where b is false.

We could imagine inspecting the body of the if statement at run time to see whether it contains disallowed operations, but in general this requires evaluating all possible execution paths of the program, which is clearly infeasible. Another possibility is to restrict all writes that follow the if statement, on the grounds that once the process has observed b, it is irrevocably tainted. However, this approach seems too restrictive to be practical. The advantage of compile-time checking is that, in effect, static analysis efficiently constructs proofs that no possible execution path contains disallowed operations.

To provide flexibility, some information flow checks are desirable at run time; such checks are allowed as long as their success or failure does not communicate information improperly, which must itself be checked statically! We examine run-time information flow checks later, in Section 5.10.
4.2 Basic Block Labels
As described in Section 3.1, when a data value is extracted
from a slot, it acquires the slot label. Furthermore, to ensure
that writing to a slot does not leak information, the label on
the slot must be more restrictive than the label on the data
value being written: x := v is legal only if Lv ⊑ Lx. However, while this restriction condition is necessary, it is not sufficient for avoiding information leaks, because of the possibility of implicit information flows. The code in Figure 3 provides an example. The variables x and b are both slots. The expressions 0 and 1 do not give any information about other data, so they are labeled by ⊥. Therefore the assignment x := 1 appears to be legal. However, earlier we showed that this assignment may be an information leak. Therefore, our simple rule is not sufficient. The problem becomes clearer if we rewrite the statement as x := b. Clearly, the value of x after the statement completes is dependent on the value of b, and the assignment is legal only if Lb ⊑ Lx.

    x := 0        % B = ⊥
    if b then
        x := 1    % B = Lb
    end
                  % final block: B = ⊥

Figure 4: Basic blocks for an if statement
In general, the flow of control within a program depends on the values of certain expressions. At any given point during execution, various values v1, ..., vn have been observed in order to decide to arrive at the current program location. Any mutation that occurs can potentially leak information about the observed values vi, so the slot that is being mutated must be at least as restricted as the labels on all these values: Lv1 ⊔ Lv2 ⊔ ... ⊔ Lvn. This label can be determined through straightforward static analysis of the program's basic block diagram, and will be called the basic block label, B. The basic block label indicates information that might be inferred by knowing that the program is executing this basic block. Using the basic block label, we can write the correct rule for checking assignment: assignment to a variable x of a value v is permitted only if Lv ⊔ B ⊑ Lx.
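The corrected rule is easy to state executably. In the sketch below (our own code, in the same toy label model), the Figure 3 store of a public constant into the public variable x is rejected inside the branch on the secret b, but accepted in a block whose label is ⊥.

    def is_restriction(l1, l2):
        return set(l1) <= set(l2) and all(l1[o] >= l2[o] for o in l1)

    def join(l1, l2):
        return {o: (l1[o] & l2[o] if o in l1 and o in l2
                    else l1.get(o, l2.get(o)))
                for o in set(l1) | set(l2)}

    def assignment_ok(l_v, l_x, block_label):
        # x := v is permitted only if Lv ⊔ B ⊑ Lx.
        return is_restriction(join(l_v, block_label), l_x)

    BOTTOM = {}                           # the unrestricted label ⊥
    l_b = {"user": frozenset()}           # b: owned, readable by no one else

    assert not assignment_ok(BOTTOM, BOTTOM, block_label=l_b)  # inside "if b"
    assert assignment_ok(BOTTOM, BOTTOM, block_label=BOTTOM)   # in a ⊥ block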
Intuitively, a basic block label must include the labels
of all values that were observed to reach that point in the
execution. For example, consider the basic block diagram
shown in Figure 4, which corresponds to the code of Fig-
ure 3; each basic block, represented as a box in the diagram,
is characterized by a single basic block label, and has one or
two exit points. Here the basic block for x := 1 has label Lb, because the value of b had to be observed to reach that point in the program. However, the label of the "final" block is ⊥, because at that point the program has no knowledge of any values. It is true that the program could discover information by performing tests that read from slots (e.g., b); however, the basic block label captures what the program knows without any further reading.

    x := 0            % B = ⊥
    while b do
        x := 1        % B = Lb
        b := false    % B = Lb
    end
                      % final block: B = ⊥

Figure 5: Basic blocks for a while statement
Labels of basic blocks are derived as follows. The decision about which exit point to follow from a block d is made based on the observation of some value v. The label B for a particular basic block n is the join of some of the labels Lv. A label Lv is included in the join if it is possible to reach n from d, and it is also possible to reach the final node from d without passing through n (if all paths from d to the final node pass through n, then arriving at n conveys no information about v). The set of Lv that must be included in each B can be efficiently computed using standard compiler techniques. This rule for basic block label propagation is equivalent to the rule of Denning and Denning [DD77].
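The propagation rule can be transcribed almost literally; the sketch below (ours, one possible rendering of the rule) recomputes the labels for Figure 4's diagram. Reachability is taken from a block's successors, so a branch on a cycle can contribute to its own label.

    def reachable(cfg, start, avoid=None):
        # Blocks reachable from start's successors without entering `avoid`.
        seen, stack = set(), list(cfg.get(start, ()))
        while stack:
            n = stack.pop()
            if n == avoid or n in seen:
                continue
            seen.add(n)
            stack.extend(cfg.get(n, ()))
        return seen

    def block_label(cfg, branches, n, final, join, bottom):
        b = bottom
        for d, l_v in branches.items():
            # Include Lv when n lies after branch d, and the final node is
            # still reachable from d while avoiding n.
            if n in reachable(cfg, d) and final in reachable(cfg, d, avoid=n):
                b = join(b, l_v)
        return b

    # Figure 4: entry("x := 0; branch on b") -> body("x := 1") -> final.
    cfg = {"entry": ["body", "final"], "body": ["final"]}
    branches = {"entry": {"user": frozenset()}}        # the label Lb
    join2 = lambda a, b: {**a, **b}    # sufficient here: owner sets disjoint
    assert block_label(cfg, branches, "body", "final", join2, {}) \
           == {"user": frozenset()}                    # B = Lb
    assert block_label(cfg, branches, "final", "final", join2, {}) == {}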
Now, consider the execution of a while statement, which creates a loop in the basic block diagram. This situation is illustrated in Figure 5. Note that for the final basic block, we obtain B = ⊥ by reasoning in the same way as we did for the if statement. This labeling might seem strange, since when we arrive at the final block, we know the value of b. However, arriving at the final block gives no information about the value of b before the code started, and there is no way to use code of this sort to improperly transmit information.

This labeling rule holds as long as all programs terminate, or at least as long as there is no way to derive information from the non-termination of a program [DD77, AR80]. The way one decides that a program has not terminated is to time its execution, either explicitly or through asynchronous communication with another thread. We do not address timing channels in this paper.
If the language allows the raising of exceptions and the use of return statements, the returned value must also be labeled by the label of the basic block that contains the raise or return. This fact can be seen clearly by converting a procedure that uses a return statement into one that uses boolean variables to keep track of control flow.
5 Application to a Language
In this section we define a simple programming language that incorporates our model. The goal of this exposition is not to seriously propose a programming language, but to demonstrate that the information flow model can be applied practically to a rich computational model, providing sound and acceptably precise constraints on information flow. These annotations could be applied to other programming models, such as compiled code or JVM code [LY96].
The language supports the usual simple types: integers, strings, records, and arrays of any legal type including other array types. Procedures may contain variable declarations, assignments, if statements, and while statements; they return results by assigning to special return variables, as in Pascal. Variables of record or array types are references to storage on the heap, as in Java [GJS96] and CLU [LAB+84], so that assignment of a record or array (e.g., x := y) makes the variables aliases for each other.
For simplicity, the language has no exceptions. Excep-
tions complicate the propagation of basic block labels, but
can be handled using the basic-block propagation described
in Section 4.2, assuming that procedures declare the excep-
tions they might raise. Not having exceptions makes some
programs clumsy, but programs written with exceptions can
be straightforwardly translated into programs without excep-
tions, so there is no loss of generality.
For simplicity of presentation, the language also lacks global variables. In our simple language, the first basic block in a procedure has label ⊥. Global variables could be supported by allowing procedures to accept a special parameter that defines the label of their first basic block.
The language is extended with a few unusual features to support information flow control:

- All variables, arguments, and procedure return values have labeled types. If a type is unlabeled, the label is either a parameter or is inferred, depending on context.

- An explicit declassify operator allows the declassification of information.

- A procedure can explicitly test, with the acts for statement, whether it is able to act for some principal, and if it is, it may use the authority of that principal.

- A call to a procedure may grant some of the authority possessed by the caller to the procedure being called, and the called procedure may test and use that authority.

- Variables and arguments may be declared to have the special base type label, which permits run-time label checking. Variables of type label and argument-label parameters may be used to construct labels for types that are mentioned within the procedure body.

- A labelcase statement can be used to determine the run-time labeled type of a value, and a special type conveniently encapsulates values along with their run-time labels.

We begin with an example of a program to illustrate some of the features in the language. Then we define the language in more detail, including how to check the constructs using label-checking rules.
5.1 An Example
Figure 6 shows the password-checking procedure, which accepts a database of passwords, a password, and a user name, and returns a boolean indicating whether the string is the right password for that user. This example is simple, yet it uses declassification to control information flow in a fine-grained way that is not possible under any previous model of which we are aware.

Figure 6: Annotated password checker

Two principals are mentioned in this code: checker represents the password checker role, and client represents the principal of the calling client. The password database is an array of records giving the passwords for users; it should be protected by encapsulation (e.g., the password check should be a method) but this complexity is avoided here. In a real program, checker would be a parameter to the routine rather than a single fixed principal; a more general (and more concise) version of the procedure is shown later, in Figure 9.

In the figure, all labels are declared explicitly by annotating types with label expressions in braces. The type t{L} describes a value of type t that is restricted by the label L. For example, the password argument is readable by the checker but owned by the client, which ensures that the checker can examine the password but cannot distribute it further. The annotations are onerous in this case partly because variable labels are not being inferred; Section 6 shows how to infer labels for this procedure.

The comments (beginning with a "%") on the right-hand side of the example indicate the static value of the basic block label, B, and are not part of the code.
To perform its work, the procedure uses the database to find the password of the user. As it looks at the data, its basic block label picks up any dependencies. For example, the if predicate examines the user name and password and the fields of the database record; therefore, its body has the basic block label {checker: checker; client: checker, client}. This means that the label of the variable assigned in the body of the if statement must be at least this restrictive, or the assignment would leak information (i.e., would result in a compile-time error).

However, the client requires a result with label {client: client}. To provide such a result, the checker must explicitly declassify the result of the comparison, removing checker from the owner set. The declassification can be carried out only if the procedure has the proper authority (i.e., can act for checker). Therefore, the procedure checks whether it runs with this authority (in the acts for statement); in the then clause, the code runs with the authority of the password checker; otherwise, it does not.

Code receives authority by being granted it in the principal hierarchy; each procedure has its own principal that can participate in acts-for relations. Of course, granting code the right to act for a particular principal can only be done by a process that acts for that principal, so checker must have explicitly granted the procedure the right to act for it. The expectation of the author of the procedure is that it has been given this right. However, the procedure is constructed so that it does not leak information if the right to act for checker is later revoked.

The need for explicit declassification is not just an artifact of our model. The return value of the procedure is owned by the client, which means that the client has complete control over the value. The procedure's result could conceivably be used to determine the contents of the password database through exhaustive search. In this case, the implementor of the procedure has made a conscious decision that the amount of information leaked by the boolean return value is small enough that declassification is acceptable. Note that the decision about declassification is made locally, by code that acts for the owner of the data; no appeal to an external trusted agent is required.

It is not necessary to run the entire procedure under the checker authority. Instead, this authority is used just where it is needed, for the declassification. Also, note that the procedure only needs to declassify its result. It is not required to declassify everything extracted from the password database, and is therefore protected against accidentally leaking the database contents (e.g., if it called a helping procedure that acted on the database).
5.2 Label-Checking Rules
Now, we explain the constructs of the language and the corresponding label-checking rules. The process of verifying a program according to these rules involves two major steps: first, basic block labels are propagated; then, each individual statement is verified in the context of the basic block that contains it. Verifying a statement requires that we check for the satisfaction of corresponding label constraints, which is discussed in more detail in Section 6.

If the language contained exceptions or gotos, B would need to be computed through the basic-block label propagation rule of Section 4.2. For our simple constructs, the rules for propagation of B can be stated directly; since control flow is determined only by if and while statements, the basic-block propagation rule takes particularly simple forms. Note that in each statement rule, B represents the label for the basic block containing the statement. Many of these rules are similar in intent to those found in Denning and Denning [DD77], though these rules are different in that they are expressed without using the meet operator (⊓).
5.3 Labeled Types
As described, values and slots have labels that restrict information flow. In statically-typed languages, values and slots also have static types that can be said to restrict flow. These two restrictions can be combined, so that labels are considered to be part of the type. This interpretation allows us to use standard type-checking and notions of subtyping to describe information flows.

Every variable, argument, and return type in the language has a labeled type, which consists of a base type such as int, plus a static label. For example, the labeled type int{L} represents an integer restricted by L. In general, label expressions, consisting of the join of several labels, may appear within the braces. For example, int{o: r; L2} is restricted by both {o: r} and L2.

The type and label parts of a labeled type act independently. For any two types S and T where S is a subtype of T (S ≤ T), and for any two labels L1 and L2 where L1 ⊑ L2, S{L1} ≤ T{L2} [VSI96]. This formula also implies that S{L1} ≤ S{L1 ⊔ L3} for any other label L3.

Parametric types such as arrays and records explicitly mention labels on their type parameters. For example, we can form the type array[int{L1}]{L2}, which is an array of labeled integers, where the integers have label L1 and the array reference has label L2. In record types, similarly, the individual fields have labeled types.
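Treating the label as part of the type makes the subtype check a pair of independent tests. A sketch in the same Python model (ours, with an invented two-type base hierarchy):

    def is_restriction(l1, l2):
        return set(l1) <= set(l2) and all(l1[o] >= l2[o] for o in l1)

    # Assumed base-type order: int ≤ num (plus reflexivity).
    BASE_LE = {("int", "int"), ("num", "num"), ("int", "num")}

    def labeled_subtype(t1, l1, t2, l2):
        # S{L1} ≤ T{L2} iff S ≤ T and L1 ⊑ L2.
        return (t1, t2) in BASE_LE and is_restriction(l1, l2)

    public = {}
    secret = {"alice": frozenset()}
    assert labeled_subtype("int", public, "num", secret)
    assert not labeled_subtype("int", secret, "num", public)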
5.4 Assignment
Given an assignment of the form x := v, where x is a variable with type t{Lx} and v is an expression with type t{Lv}, the assignment is legal if Lv ⊔ B ⊑ Lx, where B is the label for the basic block containing the statement. This condition guarantees both that the information in v is not leaked by placing it in the variable x, and that performing the store operation does not leak information because it happens in this particular basic block.

Record and array assignments are similar to variable assignment, except that they may convey information by virtue of the particular record or array being assigned to. Consider record assignment of the form r.f := v. In this statement, r is a record expression with label Lr, f is a field of the record type with declared type t{Lf}, and v is an expression with type t and label Lv. The assignment is legal if Lv ⊔ B ⊔ Lr ⊑ Lf. This rule is equivalent to the one in Denning and Denning [DD77]. The rule for assignment to an array element is similar, except that the label on the array index is included on the left-hand side. Because Lr appears on the left-hand side of the rule, fields and array elements become immutable if the variable referring to the record or array becomes more protected than the field or element. For example, a record field with Lr ⋢ Lf is immutable, since otherwise information could be leaked by assigning to it.
5.5 if and while

The rules for if and while are similar to each other. Assume that b is a legal boolean expression, and that S is an arbitrary statement. The statement "if b then S end" is legal if S is legal given the basic block label B ⊔ Lb. The same condition guarantees the legality of "while b do S end". The label Lb does not need to be part of the basic block label after the execution of the if or while statement, because we are not considering covert timing channels or covert channels arising from non-termination, as discussed in Section 4.2.
5.6 Authority
A procedure executes with some authority that has been
granted to it. Authority may be granted through the prin-
cipal hierarchy or because the procedure’s caller grants the
procedure the right to act for other principals.
At any given point within a program, the compiler under-
stands the code to be running with the ability to act for some
set of principals, which we call the effective authority of the
code at that point. The effective authority can never exceed
the true authority at any point during execution.
When a procedure starts running, it has no effective au-
thority. It may increase its effective authority by testing
whether it has the authority to act for a principal. If the test
succeeds, the effective authority is increased to include that
principal. This test is accomplished by using the acts for statement:

    if acts for(p1, p2) then S1 [else S2] end

(The brackets around the else clause indicate that it is optional.)

In this statement, p2 names a principal in the principal hierarchy; p1 names the current procedure or the special keyword caller. If it names the current procedure, it means the procedure's principal, as discussed in Section 5.1. If it names caller, it denotes the principal(s) that the procedure's caller has granted it the right to act for, as discussed later in Section 5.11.

The effect of acts for is to execute the block S1 if the specified acts-for relationship exists. If the test fails, the else block S2, if any, is executed with no additional authority. If the test succeeds, the effective authority of the block S1 is increased to include p2.
5.6.1 Revocation
It is possible that while a procedure is executing the then part of an acts for statement, the principal hierarchy may change in a way that would cause the test in the statement to fail. In this case, it may be desirable to revoke the code's permission to run with that authority, and we assume the underlying system can do this, by halting the code's process, at some point after the hierarchy changes.

If a running program is killed by a revocation, information may be leaked about what part of the program was being executed. This constitutes a timing channel, and one that can be made slow enough that it is impractical to use.
5.7 Declassification

A program can explicitly declassify a value. The operation declassify(e, L) relabels the result of an expression e with the label L, using relabeling Rules 1 and 2 as needed.

Declassification is checked statically, using the effective authority at the point of declassification; the authorization for declassification must derive from a containing acts for control structure. Declassification is legal as long as the effective authority permits declassification to L, which implies the following rule. Let LA be a label in which every principal in the effective authority is an owner, but with an empty reader set. The most restrictive label that a value could have and still be declassifiable to L is L ⊔ LA, so the expression declassify(e, L) is legal if Le ⊑ L ⊔ LA. For example, if the principal A is part of the effective authority, the label {A: B,C; D: E} can be declassified to {A: C; D: E}, since {A: C; D: E} ⊔ {A:} = {A: ; D: E}, which is more restrictive than {A: B,C; D: E}.
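The rule and the worked example can be checked mechanically. In the following sketch (our code, same toy model), the second check adds a reader, a change that genuinely requires the authority.

    def is_restriction(l1, l2):
        return set(l1) <= set(l2) and all(l1[o] >= l2[o] for o in l1)

    def join(l1, l2):
        return {o: (l1[o] & l2[o] if o in l1 and o in l2
                    else l1.get(o, l2.get(o)))
                for o in set(l1) | set(l2)}

    def declassify_ok(l_e, l_to, authority):
        # declassify(e, L) is legal iff Le ⊑ L ⊔ LA, where LA makes every
        # authority principal an owner with an empty reader set.
        l_a = {p: frozenset() for p in authority}
        return is_restriction(l_e, join(l_to, l_a))

    l_e  = {"A": frozenset({"B", "C"}), "D": frozenset({"E"})}
    l_to = {"A": frozenset({"C"}),      "D": frozenset({"E"})}
    assert declassify_ok(l_e, l_to, authority={"A"})      # the example above

    wider = {"A": frozenset({"B", "C", "F"}), "D": frozenset({"E"})}
    assert declassify_ok(l_e, wider, authority={"A"})     # adds reader F
    assert not declassify_ok(l_e, wider, authority=set())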
5.8 Label Polymorphism
Consider a library routine such as the cosine function (cos). It would be infeasible to declare separate versions of cos for every label in the system. Therefore, we allow procedures to be generic with respect to the labels on their arguments, which means that only one cos function need exist.

If a label is omitted on the type of a procedure argument x, the argument label becomes an implicit parameter to the procedure, and may be referred to as {x} elsewhere in the procedure signature and body. For example, the cosine function is declared as follows:

    cos(x: real) returns (result: real{x})

cos is generic with respect to the label on the argument x, and {x} is an implicit argument to the routine. This signature allows cos to be used on any argument, and the label on the return value is always the same as the label on the argument. Since the code of cos does not depend on what {x} really is, its code need not access the label, so there is no need either to recompile the code for each distinct label or to pass the label at runtime. Therefore, implicit label polymorphism has no run-time overhead.
5.9 Run-time Labels
Implicit labels allow code to be written that is generic with respect to labels on arguments. However, sometimes more power is needed: for example, to model the accounts database of the bank example, where every customer account has a distinct label. To allow such code to be written, we support run-time labels.

A variable of type label may be used both as a first-class value and as a label for other values. For example, procedures can accept arguments with unknown labels, as in the following procedure declaration:

    p(lb: label, x: int{lb})

To simplify static analysis, first-class label variables are immutable after initialization. When a label variable is used as a label, it represents an unknown but fixed label. Because labels form a lattice and obey the simple rules of a lattice, static reasoning about this label is straightforward. For example, if the procedure contains the assignment y := x, where y has type int{L}, the assignment is valid as long as it can be statically determined that lb ⊑ L. This condition can be checked even when x and y do not declare their labels explicitly, as discussed in Section 6.

Since label is a type, it too must be labeled wherever it is used. Constant labels and implicit label parameters have type label{}. Declarations of run-time labels can indicate the label's label explicitly; if the label's label is omitted, it is treated like any other type: it is inferred in the case of a local variable, and it is implicit in the case of an argument (this is the situation for the lb argument of p).

In principle, code that is written in terms of implicit labels can be expressed in terms of run-time labels. We provide implicit labels because when they provide adequate power, they are easier and cheaper to use than run-time labels. For example, without implicit labels the signature of the cos function would be the following:

    cos(lb: label, x: real{lb}) returns (result: real{lb})
5.10 Labelcase
The labelcase statement is used to determine the run-time
label on a value. It effectively allows a program to examine
variables of type label. For example, a program might use
labelcase to match a label against other labels it has
available. A labelcase statement has the following form:

    labelcase e as v
        when l1 do S1
        when l2 do S2
        else S3
    end

The effect of this statement is to execute the first statement
S_i such that the run-time label lbl of e satisfies lbl ⊑ l_i,
introducing a new variable v containing the same value as e,
but with the label l_i.
The block label in the arm of the labelcase does not reflect
the label of e, but it does reflect the label of the label
lbl. Suppose the type of e is T{lbl}, where lbl has the type
label{L_lbl}. Similarly, suppose the labels l_i have
respective types label{L_i}. The labelcase statement is valid
only if each of the statements S_i is valid given a basic
block label B_i:

    B_i = B ⊔ L_lbl ⊔ L_1 ⊔ … ⊔ L_i

This formula says that selecting which statement to execute
reveals information about the labels, but does not reveal
anything about information protected by the labels. Therefore,
the basic block labels depend on the L_i but not on the l_i.
The reason for joining all the L_j up to L_i is that the arms
of the labelcase are checked sequentially, so each arm that
fails conveys some information to its successors.
In a labelcase, neither the tested expression nor the labels
on any of the arms may mention the implicit labels on
procedure arguments; rather, the labelcase is limited to
constant labels and run-time labels. Implicit procedure
argument labels are only intended to help write code that
doesn't care what the labels are on the arguments. This
restriction on the labelcase avoids the need to pass implicit
labels to the generic code that uses them, as discussed in
Section 5.8. Since labels may be passed as explicit arguments,
and values of type label may be used to label types, no power
is lost through this restriction.
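A hypothetical Python sketch of the dispatch semantics (the
encoding and all names are illustrative): execute the first
arm whose label is at least as restrictive as the value's
run-time label.

    from typing import Any, Callable, Dict, FrozenSet, List, Tuple

    Label = Dict[str, FrozenSet[str]]   # owner -> allowed readers

    def leq(l1: Label, l2: Label) -> bool:
        """l1 may be relabeled to l2: l2 is at least as restrictive."""
        return all(o in l2 and l2[o] <= rs for o, rs in l1.items())

    def labelcase(value: Any, lbl: Label,
                  arms: List[Tuple[Label, Callable[[Any], Any]]],
                  otherwise: Callable[[], Any]) -> Any:
        """Run the first arm S_i with lbl <= l_i; the arm receives the
        value rebound under label l_i. Arms are tried in order, so each
        failed test leaks a little to its successors, as noted above."""
        for l_i, s_i in arms:
            if leq(lbl, l_i):
                return s_i(value)
        return otherwise()

    secret = {"alice": frozenset({"bob"})}
    public: Label = {}                   # the unrestricted label
    result = labelcase(42, public,
                       [(secret, lambda v: ("matched", v))],
                       lambda: ("no match", None))
    assert result == ("matched", 42)     # public <= secret: the arm runs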
5.11 Procedures
A procedure definition has the following syntax:

    procedure id [authority] ( arguments )
        [returns ( id : type )]
        body
    end

    authority ::= authority ( caller )
    arguments ::= a1 : T1{L1}, …, an : Tn{Ln}

The optional authority clause indicates whether the procedure
may be granted some authority by its caller. A procedure with
an authority clause may try to claim this authority by using
caller in an acts for statement. The a_i are the names of the
arguments to the procedure. The arguments have types T_i and
optional labels L_i. As in variable declarations, labels in
the arguments and results of a procedure signature may be
simple label expressions, including joins of other labels.
A call to a procedure is legal only if each actual argument to
the procedure is assignable to the corresponding formal
argument. This means that the formals must have labels that
are more restrictive than the block label of the caller at the
time of the call, i.e., the normal rule for assignment given
in Section 5.4 applies here. Additionally, bindings must be
found for all the implicit label parameters such that for each
explicit or implicit formal argument label L_f and actual
argument label L_a, the constraint L_a ⊔ B ⊑ L_f holds, where
B denotes the basic block label of the call site. Determining
the bindings for the implicit labels that will satisfy all
these constraints is not trivial, since the formal argument
labels may be join expressions that mention other implicit
argument labels, as in a signature whose argument label is a
join such as {x} ⊔ L, where L is some other label. The
efficient solution of such constraint systems is considered in
Section 6.
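A minimal sketch of this check, under the simplifying
assumption that every implicit parameter labels some argument
by itself (as {x} labels the argument of cos); the encoding
and all names are illustrative. Such a parameter is bound to
that actual's label joined with B, and then each constraint is
verified:

    from typing import Dict, FrozenSet, List

    Label = Dict[str, FrozenSet[str]]

    def join(*ls: Label) -> Label:
        out: Label = {}
        for l in ls:
            for o, rs in l.items():
                out[o] = out[o] & rs if o in out else rs
        return out

    def leq(l1: Label, l2: Label) -> bool:
        return all(o in l2 and l2[o] <= rs for o, rs in l1.items())

    def check_call(formals: List[dict], actuals: List[Label],
                   B: Label) -> bool:
        # Bind each implicit parameter that labels an argument alone.
        env: Dict[str, Label] = {}
        for f, a in zip(formals, actuals):
            if len(f["implicits"]) == 1 and not f["consts"]:
                env[f["implicits"][0]] = join(a, B)
        # Check L_a join B <= L_f for every argument.
        for f, a in zip(formals, actuals):
            lf = join(*[env[p] for p in f["implicits"]], *f["consts"])
            if not leq(join(a, B), lf):
                return False
        return True

    # p(x: int{x}, y: int{ {x} join L }) called under an empty block:
    L: Label = {"bank": frozenset()}
    la: Label = {"alice": frozenset({"alice"})}
    formals = [{"implicits": ["x"], "consts": []},
               {"implicits": ["x"], "consts": [L]}]
    assert check_call(formals, [la, join(la, L)], {})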
Furthermore, if the authority clause is present, the caller
may provide one or more principals that it acts for. Such a
call can occur only within the scope of an acts for statement,
since otherwise the effective authority of the caller is nil.
For example, the following call from the procedure p grants
the authority of the principal x to the procedure q:

    if p acts for x then
        q(x)
    end
This model for granting authority protects the caller of
a procedure because it can select what part of its authority
to grant to the procedure. The implementor of the called
procedure is also protected, because the procedure uses only
the authority its implementor wishes to claim, and only where
needed. (This is similar to the CACL model [RSC92], but
provides more protection for the implementor.)
5.12 The protected Type
Run-time label checking is conveniently accomplished by using
the special type protected. A protected is an immutable object
that contains two things: a value of type T, and a label lbl
that protects the value.
The protected type is particularly useful for implementing
structures like file systems, where the information flow
policies of individual objects cannot be deduced statically.
For example, a file system directory is essentially an array
of protected file objects. Navigating the file system requires
a series of run-time label checks, but this work is
unavoidable.
A protected has two methods: get, which extracts the contained
value, and get_label, which extracts the contained label. It
also has a constructor, protect, which creates a new protected
value. The signatures of these operations are shown in
Figure 7.

    protect(v: T, lbl: label) returns (protected)
        Create a new protected containing the value v
        and the label lbl.
    get_label(p: protected) returns (label)
        Return the label that is contained in p.
    get(p: protected, l: label) returns (T)
        Return the value that is contained in p if the
        label l matches the contained label; set the
        success flag accordingly.

    Figure 7: The operations of protected

The get method requires an expected label as an argument. The
value is returned only if the expected label is at least as
restrictive as the contained label, which is determined at run
time. The expected label must be either a constant label or a
variable label; implicit labels are not allowed.
If the language is extended to support some form of data
abstraction with encapsulation, the protected type can be
implemented in a natural manner by using the labelcase
statement. Without these extensions, the closest approximation
to protected is an immutable record type containing a label
field lbl and a value field of type T{lbl}. Like this type,
the protected type has the special property that the label on
the reference to a protected (the label {p}) is assumed to be
the label on the contained label, L_lbl. This constraint can
be maintained because a protected is immutable, and because
p1 = p2 implies lbl1 = lbl2.
A protected allows information flow via its contained label:
one could pass information by storing different labels in
protected objects, and then seeing whether get operations
succeed. However, the signature of get prevents an information
leak, because the result is labeled both by the label on the
label being passed in ({l}) and by p's label ({p}), which is
the same as the label on the contained label. Therefore, this
information flow does not create a leak.
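The following Python sketch mirrors this design (class and
method names follow the reconstruction above but are
illustrative): an immutable value-label pair whose value is
released only against a sufficiently restrictive expected
label, checked at run time.

    from typing import Any, Dict, FrozenSet

    Label = Dict[str, FrozenSet[str]]

    def leq(l1: Label, l2: Label) -> bool:
        return all(o in l2 and l2[o] <= rs for o, rs in l1.items())

    class Protected:
        """An immutable (value, label) pair, after the protected type."""
        __slots__ = ("_value", "_label")

        def __init__(self, value: Any, label: Label):
            object.__setattr__(self, "_value", value)
            object.__setattr__(self, "_label", dict(label))

        def __setattr__(self, name, v):
            raise AttributeError("protected values are immutable")

        def get_label(self) -> Label:
            return dict(self._label)

        def get(self, expected: Label) -> Any:
            """Release the value only if 'expected' is at least as
            restrictive as the contained label."""
            if not leq(self._label, expected):
                raise ValueError("label is insufficiently restrictive")
            return self._value

    p = Protected(1200.0, {"alice": frozenset({"alice"})})
    assert p.get({"alice": frozenset()}) == 1200.0   # fewer readers: ok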
5.13 The Bank Example
The bank example from Section 2 is a good example of the need
for run-time labels. Each customer account is owned by the
individual customer, rather than by the bank, which gives the
customer more confidence in the privacy offered by the bank.
This example can be conveniently written using protected to
protect each customer's account with a label specific to the
customer, as shown in Figure 8.
For example, the customer's account can be a simple record
type, where the customer's name is protected by (and
accessible to) the bank, but the balance is protected by both
the bank label and a customer label that is stored inside the
protected. This design gives the customers protection against
dissemination of their balance information.
In this code, the bank represents a principal with the ability
to declassify the bank label. The current balance of an
account is obtained by a lookup procedure that accepts a
first-class label as an argument. If the label that is passed
in is at least as restrictive as the label in the account, the
balance is returned as a float labeled by the supplied label.
The procedure fails either if no customer exists by that name,
or if the label passed in is insufficiently restrictive.
Figure 8: Bank Example
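A Python sketch of the pattern Figure 8 expresses (ACCOUNTS,
get_balance, and the label encoding are illustrative
assumptions, not the paper's code):

    from typing import Dict, FrozenSet, Optional, Tuple

    Label = Dict[str, FrozenSet[str]]

    def leq(l1: Label, l2: Label) -> bool:
        return all(o in l2 and l2[o] <= rs for o, rs in l1.items())

    # name (bank-labelled) -> (balance, customer-specific label)
    ACCOUNTS: Dict[str, Tuple[float, Label]] = {
        "alice": (1200.0,
                  {"bank": frozenset(), "alice": frozenset({"alice"})}),
    }

    def get_balance(name: str, lbl: Label) -> Optional[float]:
        """Return the balance labelled by lbl; fail if the customer is
        unknown or lbl is insufficiently restrictive."""
        entry = ACCOUNTS.get(name)
        if entry is None:
            return None
        balance, contained = entry
        if not leq(contained, lbl):        # the run-time label check
            return None
        return balance                     # the caller treats this as {lbl}

    assert get_balance("alice",
                       {"bank": frozenset(), "alice": frozenset()}) == 1200.0
    assert get_balance("alice", {"bank": frozenset()}) is None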
5.14 Output Channels
Output channels show up as the special opaque type channel in
the language. The channel type denotes an output channel with
a hidden reader set, whose members denote the principals who
are reading from the channel. Information can be written to a
channel using the write procedure:

    write(c: channel{lbl}, x: T{lbl}, lbl: label{lbl})

When a write is attempted, the label lbl is compared against
the hidden reader set within the channel. If it passes the
tests of the effective reader set that are described in
Section 3.5, the write is successful. Otherwise, it silently
fails.
It is important that lbl capture all the possible information
flows that the write will cause, since otherwise write would
not perform a sufficiently stringent test against the
channel's reader set. Because lbl is used to label x, the
channel cannot leak information through the contents of the
data that is sent out. Because lbl is used to label the
channel argument as well, the channel cannot be used to leak
information by choosing among a set of channels to write to.
Finally, because lbl labels itself, the channel cannot be used
to leak information by changing the label that is passed in as
the argument, and transmitting information by the fact of the
write's success or failure.
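A hypothetical sketch of this behaviour in Python (the class
is illustrative, and the test is simplified: it ignores the
acts-for refinements of Section 3.5):

    from typing import Any, Dict, FrozenSet, List

    Label = Dict[str, FrozenSet[str]]

    class OutputChannel:
        def __init__(self, readers: FrozenSet[str]):
            self._readers = readers        # the hidden reader set
            self.written: List[Any] = []

        def write(self, x: Any, lbl: Label) -> None:
            """Succeed only if every hidden reader is permitted by every
            owner in lbl; otherwise fail silently, returning no result
            that could itself leak information."""
            if all(self._readers <= rs for rs in lbl.values()):
                self.written.append(x)

    ch = OutputChannel(frozenset({"bob"}))
    ch.write("hello", {"alice": frozenset({"bob"})})   # permitted
    ch.write("secret", {"alice": frozenset()})         # silently dropped
    assert ch.written == ["hello"]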
6 Verification and Label Inference
This section describes code verification and label inference
in more detail. It demonstrates that basic block labels,
labels on local variables, and labels in declassify
expressions can be inferred efficiently and automatically,
making it much more convenient to write properly annotated
code. We present a small example of this process.
6.1 Label Constraint Systems
As described earlier, procedure verification is a two-step
process that first determines basic block labels by
propagating the labels of branch-determining expressions, and
then checks all the statements in the program in the context
of their basic blocks. While the expressions that control
branch decisions can be identified statically, their labels
depend on the label of the basic block in which they occur. We
seem to have a chicken-and-egg problem.
The difficulty can be resolved by creating a label constraint
system and solving for all the inferred labels simultaneously.
We start by inventing fresh labels for the branch expressions
of the basic blocks. These labels are propagated according to
the rules for determining basic block labels, and fed into the
statement verification stage. Verifying each statement
requires checking the corresponding constraints involving
labels. To verify code, we collect all the constraints that
are demanded by the various statements, and solve them as a
system. If the constraints are inconsistent and generate a
contradiction, the program contains an information leak.
Solving the constraint system is tractable because the
constraints all take a particularly simple form. Each
constraint has either a label or a join of labels on both the
left- and right-hand sides, e.g., L1 ⊔ L2 ⊑ L3 ⊔ L4. This
constraint is equivalent to the two constraints L1 ⊑ L3 ⊔ L4
and L2 ⊑ L3 ⊔ L4, and in general, we can convert the
constraints to a canonical form in which there is only one
label (variable or constant) on the left-hand side.
Explicitly declared labels (whether constants or first-class
labels) and implicit argument labels are treated as constants;
basic block labels and undeclared labels on local variables
are treated as variables. The goal of verification is to
determine whether there are assignments to the local variable
labels and basic block labels that satisfy all the
constraints. This problem is similar in form to the problem of
satisfying propositional Horn clauses; in fact, a linear-time
algorithm for satisfying Horn clauses [DG84, RM96] can be
adapted easily to this problem. If, on the other hand, we had
permitted use of both the ⊔ and ⊓ operators in constructing
label expressions, the label satisfaction problem would have
become NP-complete [RM96].
The algorithm works by keeping track of conservative upper
bounds for each unknown label. Initially, all the upper bounds
are set to the most restrictive label. The algorithm then
iteratively decreases the upper bounds, until either all
equations are satisfied or a contradiction is observed. At
each step, the algorithm picks an equation that is not
satisfied when all variables are substituted by their upper
bounds. If the unsatisfied equation has a constant label on
its left-hand side, a contradiction has been detected.
Otherwise, the upper bound estimate for the variable label on
the left-hand side is adjusted to be the meet (⊓) of its
current upper bound and the value of the right-hand side. In
evaluating the right-hand side, all variables are replaced
with their current upper bound estimates.

Figure 9: Password example with implicit labels
Like the algorithm for satisfying Horn clauses, this algorithm
requires a number of iterations that is linear in the total
size of the constraints; the total size of the constraints is
at worst quadratic in the length of the code. Therefore, this
inference algorithm seems very practical.
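A Python sketch of this solver (an illustrative encoding;
constraints are assumed to be in canonical form already, with
a variable name or constant label on the left and a join of
terms on the right, and None standing for the most restrictive
label):

    from typing import Dict, FrozenSet, List, Optional, Tuple, Union

    Label = Dict[str, FrozenSet[str]]
    Bound = Optional[Label]              # None represents the top label
    Term = Union[str, Label]             # a variable name or a constant

    def join_all(ls: List[Bound]) -> Bound:
        if any(l is None for l in ls):
            return None                  # joining with top yields top
        out: Label = {}
        for l in ls:
            for o, rs in l.items():
                out[o] = out[o] & rs if o in out else rs
        return out

    def meet(l1: Bound, l2: Bound) -> Bound:
        if l1 is None: return l2
        if l2 is None: return l1
        return {o: l1[o] | l2[o] for o in set(l1) & set(l2)}

    def leq(l1: Bound, l2: Bound) -> bool:
        if l2 is None: return True
        if l1 is None: return False
        return all(o in l2 and l2[o] <= rs for o, rs in l1.items())

    def solve(constraints: List[Tuple[Term, List[Term]]],
              variables: List[str]) -> Optional[Dict[str, Bound]]:
        """Each constraint (lhs, rhs) means lhs <= join(rhs). Returns
        the most restrictive satisfying bounds, or None on a
        contradiction."""
        bound: Dict[str, Bound] = {v: None for v in variables}
        val = lambda t: bound[t] if isinstance(t, str) else t
        changed = True
        while changed:
            changed = False
            for lhs, rhs in constraints:
                r = join_all([val(t) for t in rhs])
                if leq(val(lhs), r):
                    continue
                if not isinstance(lhs, str):
                    return None   # constant on the left: contradiction
                bound[lhs] = meet(bound[lhs], r)
                changed = True
        return bound

    A_B: Label = {"A": frozenset({"B"})}
    assert solve([("x", [A_B])], ["x"]) == {"x": A_B}
    # x <= {A: B} together with {A:} <= x is unsatisfiable:
    assert solve([("x", [A_B]), ({"A": frozenset()}, ["x"])],
                 ["x"]) is None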
The labels found by this algorithm are the most restrictive
labels that satisfy the constraints. However, the actual
values that the inference algorithm finds are irrelevant,
because they are never converted to first-class values of type
label. What is important is that there is a satisfying
assignment to all the labels, proving that the code is safe.
6.2 Inference Example
Figure 9 shows the code for a more flexible version of the
password-checking procedure that was presented earlier. This
version is usable by any client principal. Because the
procedure's arguments have no declared labels, their labels
are implicit parameters to the routine. Note that the local
variables i and match do not explicitly declare their labels.
The resulting procedure is as safe as the previous version,
and easier to implement and use. Let us now walk through the
verification process for this code.
The first step in verification is to construct the basic-block
diagram and propagate fresh labels that represent branch
expressions. The comments in Figure 9 show the value of the
basic block labels for each basic block, in terms of the two
branch-expression labels B1 and B2 (for the while and the if,
respectively).

Figure 10: Constraints for the password example

Figure 11: Constraint solutions
Next, the code is analyzed to generate the set of label
constraints shown in Figure 10, which include inequalities
corresponding to the statements in the program, plus some
equalities that bind the basic-block branch-expression labels
to the labels of the corresponding expressions. The equalities
can be transformed into pairs of constraints to preserve the
canonical constraint form. Note that the use of declassify
generates an additional constraint, introducing a new variable
that represents the label of the declassified result.
This procedure provides a good example of why it is important
for declassification to be able to apply to just one component
of a join, as discussed in Section 3.4. The fact that
declassification works at all in this procedure, let alone can
be verified automatically, is due to this property of the
declassification rule.
Applying the constraint-solving algorithm just described, a
single backward pass through the canonical forms of these
constraints yields labels that satisfy them, as shown in
Figure 11.
7 Related Work
There has been much work on information flow control and on
the static analysis of security guarantees. The lattice model
of information flow comes from the early work of Bell and
LaPadula [BL75] and Denning [Den76]. More recent work on
information flow policies has examined complex aggregation
policies for commercial applications [CW87, BN89, Fol91]. We
have not addressed policies that capture conflicts of
interest, though our fine-grained tracking of ownership
information seems applicable. Many of these information
control models use dynamic labels rather than static labels
and therefore cannot be checked statically. IX [MR92] is a
good example of a practical information flow control system
that takes this approach. Our propagation of ownership
information is also reminiscent of models of access control
that merge ACLs at run time [MMN90].
Static analysis of security guarantees also has a long
history. It has been applied to information flow [DD77, AR80]
and to access control [JL78, RSC92]. There has recently been
more interest in provably-secure programming languages,
treating information flow checks in the domain of type
checking [VSI96, Vol97]. Also, integrity constraints [Bib77]
have been treated as type checking [PO95].
We have avoided considering covert channels arising from time
measurement and thread communication. A scheme for statically
analyzing thread communication has been proposed [AR80];
essentially, a second basic block label is added with
different propagation rules. This second label is used to
restrict communication with other threads. The same technique
would remove timing channels, and could be applied to our
scheme. It is not clear how well this scheme works in
practice; it seems likely to restrict timing and communication
quite severely. Static side-effect and region analysis [JG91],
which aims to infer all possible side-effects caused by a
piece of code, may be able to capture effects like timing
channels.
8 Conclusions
This work was motivated by a desire to provide better security
when using downloaded applications, by making fine-grained
information flow control practical and efficient. This paper
is a first step towards this goal.
A key limitation of most multilevel security models is that
there is a single, centralized policy on how information can
flow. This kind of policy does not match well with the
decentralized model in which each user is independent. Our
model provides each user the ability to define information
flow policy at the level of individual data items. Each user
has the power to declassify his own data, but declassification
does not leak other users' data.
An important aspect of our label model is that labels identify
the owners, or sources, of the data. Each owner listed in the
label maintains its own separate annotation (the reader set)
to control where its data may flow. When running on behalf of
a particular principal P, a program is able to edit P's
portion of the label without violating other principals'
policies. Labels form a familiar security class lattice, but
each principal has access to some non-lattice relabelings (by
declassifying).
We have also shown how the labels we define can be used to
annotate a simple programming language, which suggests that
other programming languages and intermediate code formats can
be similarly annotated. To ensure proper information flows,
the labels in the resulting code can be statically checked by
the compiler, in a manner similar to type checking. Also, as
part of the label checking process, the compiler can construct
a trace of its checking process. Some form of this trace can
accompany the generated code, and can be used later to verify
information flows in the code.
Labels are mostly checked statically, which has benefits in
space and time. Compiled code does not need to perform checks,
so the code is shorter and faster than with run-time checks.
Storage locations that are statically typed require no extra
space to store a label. However, we have also defined a
mechanism for run-time label checking that allows the
flexibility of run-time checks when they are truly needed.
This mechanism guarantees that when the run-time checks fail,
information is not leaked by their failure. The mechanism of
implicit label polymorphism also extends the power of static
analysis, by allowing the definition of code that is generic
with respect to some or all of the labels on its arguments.
We have presented a simple algorithm that, despite our
extensions to the label system, is able to efficiently infer
labels for basic blocks and local variables. Label inference
makes the writing of label-safe code significantly less
onerous.
We have provided some simple examples of code and software
designs that cannot be adequately expressed using previous
models of access control or information flow. These examples
demonstrate that the new features of user declassification,
label polymorphism, and run-time label checking add new power
and expressiveness in capturing security policies.
9 Future Work
There are many directions to explore with this new model. An
obvious next step is to implement the model by extending an
existing language compiler, and developing working
applications for examples like those in Section 2.
It should also be possible to augment the Java Virtual Machine
[LY96] with annotations similar to those proposed in Section
5. The bytecode verifier would check both types and labels at
the time that code is downloaded into the system.
The computational model described in Section 5 has a
reasonable set of data types. However, it ought to support
user-defined data abstractions, including both parametric and
subtype polymorphism.
Formal proofs of the soundness of the model might add some
confidence. In the absence of any use of the declassify
operation, labels are located in a simple lattice that applies
equally to all users, and previous results for the security of
lattice-based information flow models apply to this model as
well. However, because declassify is intended to allow
information to flow across or down the lattice, standard
security policies such as non-interference [GM84] are
intentionally inapplicable.
Because integrity [Bib77] constraints have a natural lattice
structure, supporting them may be an interesting extension to
our label model; the label model can be augmented to allow
each owner to establish an independent integrity policy, just
as each owner now can establish an independent information
flow policy.
We have assumed an entirely trusted execution environment,
which means that the model described here does not work well
in large, networked systems, where varying levels of trust
exist among nodes in the network. Different principals may
also place different levels of trust in the various nodes. A
simple technique for dealing with distrusted nodes is to
transmit opaque receipts or tokens for the data. However, more
work is needed to adapt our model to this kind of system. It
is also likely that the model of output channels should be
extended to differentiate among the different kinds of outputs
from the system.
We have not considered the possibility of covert channels that
arise from timing channels and from asynchronous communication
between threads, which can also be used to create timing
channels. The technique of having a timing label for each
basic block, as in Andrews and Reitman [AR80], may help with
this problem, but more investigation is needed.
Acknowledgments
The authors would like to acknowledge the helpful comments of
the many people who have read this paper, including Martín
Abadi, Atul Adya, Kavita Bala, Phil Bogle, Miguel Castro,
Steve Garland, Robert Grimm, Butler Lampson, Roger Needham,
Matt Stillerman, and the anonymous reviewers.
References
[AR80] Gregory R. Andrews and Richard P. Reitman. An axiomatic
approach to information flow in programs. ACM Transactions on
Programming Languages and Systems, 2(1):56–76, 1980.
[Bib77] K. J. Biba. Integrity considerations for secure
computer systems. Technical Report ESD-TR-76-372, USAF
Electronic Systems Division, Bedford, MA, April 1977.
[BL75] D. E. Bell and L. J. LaPadula. Secure computer system:
Unified exposition and Multics interpretation. Technical
Report ESD-TR-75-306, MITRE Corp. MTR-2997, Bedford, MA, 1975.
Available as NTIS AD-A023 588.
[BN89] D. F. Brewer and J. Nash. The Chinese wall security
policy. In Proc. of the IEEE Symposium on Security and
Privacy, pages 206–214, May 1989.
[CW87] David Clark and David R. Wilson. A comparison of
commercial and military computer security policies. In Proc.
of the IEEE Symposium on Security and Privacy, pages 184–194,
1987.
[DD77] Dorothy E. Denning and Peter J. Denning. Certification
of programs for secure information flow. Comm. of the ACM,
20(7):504–513, 1977.
[Den76] Dorothy E. Denning. A lattice model of secure
information flow. Comm. of the ACM, 19(5):236–243, 1976.
[DG84] William F. Dowling and Jean H. Gallier. Linear-time
algorithms for testing the satisfiability of propositional
Horn formulæ. Journal of Logic Programming, 1(3):267–284,
October 1984.
[Fol91] Simon N. Foley. A taxonomy for information flow
policies and models. In Proc. of the IEEE Symposium on
Security and Privacy, pages 98–108, 1991.
[GJS96] James Gosling, Bill Joy, and Guy Steele. The Java
Language Specification. Addison-Wesley, August 1996. ISBN
0-201-63451-1.
[GM84] J. A. Goguen and J. Meseguer. Unwinding and inference
control. In Proc. of the IEEE Symposium on Security and
Privacy, pages 11–20, April 1984.
[JG91] Pierre Jouvelot and David K. Gifford. Algebraic
reconstruction of types and effects. In ACM Symposium on
Principles of Programming Languages, pages 303–310, January
1991.
[JL78] Anita K. Jones and Barbara Liskov. A language extension
for expressing constraints on data access. Comm. of the ACM,
21(5):358–367, May 1978.
[LAB+84] Barbara Liskov, Russell Atkinson, Toby Bloom, J.
Eliot Moss, J. Craig Schaffert, Robert Scheifler, and Alan
Snyder. CLU Reference Manual. Springer-Verlag, 1984. Also
published as Lecture Notes in Computer Science 114, G. Goos
and J. Hartmanis, Eds., Springer-Verlag, 1981.
[LABW91] Butler Lampson, Martín Abadi, Michael Burrows, and
Edward Wobber. Authentication in distributed systems: Theory
and practice. In Proc. 13th ACM Symp. on Operating System
Principles (SOSP), pages 165–182, October 1991. Operating
System Review, 25(5).
[Lam73] Butler W. Lampson. A note on the confinement problem.
Comm. of the ACM, 16(10):613–615, 1973.
[LY96] T. Lindholm and F. Yellin. The Java Virtual Machine.
Addison-Wesley, Englewood Cliffs, NJ, May 1996.
[MMN90] Catherine J. McCollum, Judith R. Messing, and LouAnna
Notargiacomo. Beyond the pale of MAC and DAC—defining new
forms of access control. In Proc. of the IEEE Symposium on
Security and Privacy, pages 190–200, 1990.
[MR92] M. D. McIlroy and J. A. Reeds. Multilevel security in
the UNIX tradition. Software—Practice and Experience,
22(8):673–694, August 1992.
[Nec97] George C. Necula. Proof-carrying code. In Proc. of ACM
Symp. on Principles of Programming Languages, pages 106–119,
January 1997.
[PO95] Jens Palsberg and Peter Ørbæk. Trust in the λ-calculus.
In Proc. 2nd International Symposium on Static Analysis,
number 983 in Lecture Notes in Computer Science, pages
314–329. Springer, September 1995.
[RM96] Jakob Rehof and Torben Æ. Mogensen. Tractable
constraints in finite semilattices. In Proc. 3rd International
Symposium on Static Analysis, number 1145 in Lecture Notes in
Computer Science, pages 285–300. Springer-Verlag, September
1996.
[RSC92] Joel Richardson, Peter Schwarz, and Luis-Felipe
Cabrera. CACL: Efficient fine-grained protection for objects.
In Proceedings of the 1992 ACM Conference on Object-Oriented
Programming Systems, Languages, and Applications, pages
154–165, Vancouver, BC, Canada, October 1992.
[Vol97] Dennis Volpano. Provably-secure programming languages
for remote evaluation. ACM SIGPLAN Notices, 32(1):117–119,
January 1997.
[VSI96] Dennis Volpano, Geoffrey Smith, and Cynthia Irvine. A
sound type system for secure flow analysis. Journal of
Computer Security, 4(3):167–187, 1996.