Text Version of File Standard for Flow Cytometry, FCS3.0

Dave Coder (dave@nucleus.immunol.washington.edu)
Wed, 28 Aug 96 13:11:42 -0700


Below is the complete text version of FCS 3.0 that appears on the ISAC WWW
Home Page. (Please note copyright notice at end of document. Fair use
provisions of US copyright law permit limited reproduction for scholarly use.
Other uses may require permission of the copyright holder, ISAC.)

Dave Coder,
Editor, ISAC WWW Home Page
http://nucleus.immunol.washington.edu/ISAC.html
dcoder@u.washington.edu

_________________________________________________________________________
Data File Standard for Flow Cytometry, Version FCS3.0
(copyright 1996 International Society for Analytical Cytology)

Data File Standards Committee of the International Society for Analytical
Cytology (ISAC)

ABSTRACT

The flow cytometry data file standard provides the specifications needed to
completely describe flow cytometry data sets within the confines of the file
containing the experimental data. In 1984 the first Flow Cytometry Standard
format for data files was adopted as FCS1.0; this standard was modified in
1990 as FCS2.0. We report here on the proposed next generation Flow Cytometry
Standard data file format, FCS3.0. The principal goal of the Standard is to
provide a uniform file format allowing files created by one type of
acquisition hardware and software to be analyzed by another type. The
proposed FCS3.0 standard maintains backwards compatibility with previous
versions by retaining the basic FCS file structure. The FCS structure
requires that each data set in a file contains three segments: HEADER, TEXT
and DATA, with an optional ANALYSIS segment. The HEADER appears first and
contains plain text byte offsets needed to locate the other segments. The
TEXT segment contains plain text keyword-value pairs that describe the
experiment, the instrument, the specimen, the data and any other information
which the file creator wishes to include. The DATA segment contains the
actual FCM data in one of several formats specified in the TEXT segment. The
ANALYSIS segment contains plain text keyword-value pairs that describe
user-specified analyses of the data. The proposed changes in FCS3.0 include:
a mechanism for handling data sets of 100 megabytes and larger, support for
UNICODE text for keyword values, support for cyclic redundancy check (CRC)
validation for each data set, a requirement for the inclusion of information
describing the method of signal amplification and increased support for the
inclusion of time as a measurement parameter.

INTRODUCTION

The goal of the Flow Cytometry Data File Standard is to facilitate the
development of software for reading and writing flow cytometry data files in
a standardized format. Application of a standard file format allows files
created on one type of instrument to be read and analyzed by software
implemented on a different computer. The original FCS standard was published
in 1984 as FCS1.0 (1) and amended in 1990 as FCS2.0 (2).

The changes included in FCS3.0 were made necessary by rapid evolution in
microcomputer technology, computer communications, instrument design and
experimental complexity. These technological advances have resulted in an
increase in the average data file size, straining the limit of 99,999,999
bytes per data set designed into previous FCS versions. FCS3.0 provides a
mechanism to avoid this restriction while retaining backwards compatibility
for those data files which do not exceed the 100-megabyte limit. The growth
of computer networks has resulted in the routine movements of large amounts
of data between computers. This has created the need for a means of
confirming file integrity. Therefore, FCS3.0 provides support for a cyclic
redundancy check (CRC) word to be placed at the end of each FCS3.0 data set.
The use of a CRC check word allows errors, occurring during file transfer or
reading, to be detected. Time is increasingly used as a measurement
parameter. Therefore, keyword support has been added to better describe the
acquisition of time. Internationalization in the field of flow cytometry has
caused a need for the incorporation of international characters in keyword
values. Therefore, a provision has been made to support the use of multi-byte
characters for some strings by providing a keyword to support the UNICODE
character set (3).

Table of Contents

1. General
1.1 Scope
1.2 Conformance
2. Terminology and General Requirements
2.1 Conventions
2.2 Definitions
2.3 General Concepts
3. File Segments
3.1 HEADER Segment
3.2 TEXT Segment
3.3 DATA Segment
3.4 ANALYSIS Segment
3.5 CRC Value
3.6 Other Segments
4. References
5. Appendices
5.1 Appendix A - Differences from FCS2.0
5.2 Appendix B - Data File Standard Committee Members

1. General

1.1 Scope

This is version 3.0 of the Flow Cytometry Data File Standard (FCS3.0). Its
purpose is to provide detailed specifications for the structure of the data
sets produced as a result of acquiring data on a cytometer and writing the
data to a file.

1.2 Conformance

To be conformant with FCS3.0, a data file must conform to the file structure
as described in this document and must contain all required keyword-value
pairs in the primary TEXT segment of the file. A conformant file must not
contain other segments not described in the data set HEADER segment. To be
conformant with FCS3.0 an analysis program must be able to correctly read and
interpret all of the data contained in any minimum FCS3.0 conformant file (a
minimum FCS3.0 conformant file is one with only the required keyword-value
pairs in the TEXT segment of the file and no information in the ANALYSIS
segment).

2. Terminology and General Requirements

2.1 Conventions

2.1.1 The ASCII character code is used for all keywords and most of the
keyword values throughout an FCS3.0 file (see section 3.2.20 regarding the
use of UNICODE characters).

2.1.2 Numerical values are base 10 unless otherwise specified.

2.2 Definitions

2.2.1 An FCS3.0 data file consists of one or more data sets.

2.2.2 A data set is defined as the collection of information produced by a
cytometer as it carries out its measurements on some number of particles.

2.2.3 The collection of information in a data set is divided into at least
four segments including a HEADER segment, a primary TEXT segment, a DATA
segment, and an ANALYSIS segment. The ANALYSIS segment may be empty and any
number of implementor-defined segments may follow the first four segments.
New to FCS3.0 is the inclusion of an optional supplemental TEXT segment.

2.2.4 The HEADER segment identifies the data set as FCS3.0 and contains ASCII
byte offsets from the start of the data set to the beginning and end of each
of the other segments.

2.2.5 A keyword is the label of a data field. A keyword-value pair is the
label of the data field with its associated value.

2.2.6 The TEXT segments contain a series of ASCII keyword-value pairs that
describe the format of the DATA segment and most of the experimental
operating conditions. The primary TEXT segment contains all required
keyword-value pairs. The supplemental TEXT segment contains optional
keyword-value pairs only.

2.2.7 The DATA segment contains either a list of the events or histograms of the data.

2.2.8 An event is an ordered list of the cytometric measurements for one
particle. The length of an event is the number of parameters involved in the
measurement.

2.2.9 A parameter is the signal produced by one of the detectors of the
cytometer. Forward scattering is typically one of the measurement parameters.
A parameter value is a digital representation of a parameter.

2.2.10 Each data set in a data file contains all the information needed to
read and interpret the data set.

2.2.11 All space within a file which is not contained in a segment specified
in the HEADER must be filled with a space character (ASCII 32). This includes
unused space between the end of one segment and the beginning of the next
segment and between the end of the last data set and the end of the file.

2.2.12 List mode data storage means that events are stored one after the
other in a list.

2.2.13 All byte offsets are referenced to the beginning of the data set. The
first data set in a file begins at byte zero of the file.

2.2.14 The implementor is the entity that creates the software to read and
write FCS conformant data files.

2.2.15 The "delimiter" is the first character of the primary TEXT segment and
is subsequently placed in the primary TEXT, supplemental TEXT and ANALYSIS
segments to separate keywords from keyword values. The delimiter can be any
ASCII character.

2.3 General Concepts

An FCS3.0 file is composed of one or more data sets, each containing at a
minimum HEADER, TEXT and DATA segments. The HEADER, TEXT, and ANALYSIS
segments contain ASCII-encoded text readable by a text editor (some
keyword-values may contain UNICODE characters; See section 3.2.20). The DATA
segment contains flow cytometry data stored in list mode or as histograms.

3. File Segments

3.1 HEADER Segment

3.1.1 The primary purpose of the HEADER segment is to describe the location
of the other segments in the data set. The HEADER segment begins at byte
offset zero from the beginning of the data set. The first six bytes in the
HEADER segment comprise the version identifier (FCS3.0). Note, there is no
space character between the FCS and the 3.0 in the identifier. The next 4
bytes (6 - 9) are occupied by space characters (ASCII 32). Following the
identifier are at least three pairs of ASCII-encoded integers indicating the
byte offsets for the start and end of the primary TEXT segment, the DATA
segment, and the ANALYSIS segment, respectively. The byte offsets are
referenced to the beginning of the data set. Under FCS3.0 these offsets
remain limited to 8 bytes. Each ASCII encoded integer offset is right
justified in its 8 byte space. The first byte offset (bytes 10 - 17) is that
to the start of the primary TEXT segment. The next byte offset (bytes 18 -
25) is that for the end of the primary TEXT segment. The next offset (bytes
26 - 33) is that for the start of the DATA segment. The byte offset for the
end of the DATA segment occupies bytes 34 - 41. That for the start of the
ANALYSIS segment occupies bytes 42 - 49. The byte offset for the end of the
ANALYSIS segment is in bytes 50 - 57. If there is no ANALYSIS segment these
last two byte offsets can be set to zero (right justified) or left blank
(filled with space characters). Offsets to the start and end of user-defined
OTHER segments of the data set follow the ANALYSIS segment offsets. The
user-defined segments will not be interpretable by others unless appropriate
information is passed on by the data set originator.

A major change from previous FCS versions is the allowance for data sets
larger than 99,999,999 bytes. When any portion of a segment falls outside the
99,999,999 byte limit, '0's are substituted in the HEADER for that segment's
begin and end byte offset. The byte offsets for begin DATA, end DATA, begin
ANALYSIS, end ANALYSIS (begin and end supplemental TEXT if appropriate) will
then only be found as keyword-value pairs in the primary TEXT segment. Note,
when a segment is contained completely within the first 99,999,999 bytes of a
data set, the byte offsets for that segment will be duplicated in the TEXT
segment as keyword values. Note also, if the ANALYSIS offsets in the HEADER
are zero, the $BEGINANALYSIS and $ENDANALYSIS keywords must be checked to
determine if an ANALYSIS segment is present.
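
The fallback rule above can be sketched in a few lines of Python. This is only an illustration, not part of the standard: the function name resolve_segment and the convention of passing the primary TEXT keywords as a dictionary with upper-case keys are assumptions made for the example.

```python
def resolve_segment(header_begin, header_end, text, name):
    """Return (begin, end) offsets for a segment, or None if it is absent.

    header_begin, header_end: integers read from the HEADER (0 if zero or blank).
    text: primary TEXT keywords as a dict with upper-case keys (an assumption).
    name: "DATA", "ANALYSIS" or "STEXT".
    """
    begin, end = header_begin, header_end
    if begin == 0 and end == 0:
        # Zeros in the HEADER: the offsets live only in the TEXT keywords.
        begin = int(text.get("$BEGIN" + name, "0"))
        end = int(text.get("$END" + name, "0"))
    if begin == 0 and end == 0:
        return None  # no such segment in this data set
    return begin, end
```

For a small file the HEADER values are returned unchanged; for a segment beyond the 99,999,999 byte limit the $BEGINDATA and $ENDDATA values are used instead.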

Table 1. Contents of HEADER fields and the byte offsets to the beginning and
end of each field. Each offset is right justified in its field.

Contents                                                  Start and end byte positions

FCS3.0                                                    00 - 05
ASCII(32) - space characters                              06 - 09
ASCII-encoded offset to first byte of TEXT segment        10 - 17
ASCII-encoded offset to last byte of TEXT segment         18 - 25
ASCII-encoded offset to first byte of DATA segment        26 - 33
ASCII-encoded offset to last byte of DATA segment         34 - 41
ASCII-encoded offset to first byte of ANALYSIS segment    42 - 49
ASCII-encoded offset to last byte of ANALYSIS segment     50 - 57
ASCII-encoded offset to user defined OTHER segments       58 - beginning of next segment

One example HEADER segment is as follows:

FCS3.0*********256****1545****1792**202456*******0*******0

The '*' character is used to represent a space character here. The TEXT
segment starts at byte 256 from the location of the 'F' in FCS3.0 and ends at
byte offset 1545. The DATA segment starts at byte offset 1792 and ends at
202456. There is no ANALYSIS segment, so the start and end offsets are shown
as zeros. They could be left blank. Note that the HEADER segment is a
continuous byte stream with no return or line feed characters. The bytes
between the end of the HEADER segment and the start of the next segment must
be filled with the space character. In this example, the segments are in the
order HEADER, TEXT, DATA, and ANALYSIS. The FCS standard requires only that
the HEADER segment be at the start of the data set and the primary TEXT
segment be located entirely within the first 99,999,999 bytes.

A second example of a legal HEADER segment is:

FCS3.0*********256****1545*******0*******0*******0*******0

The '0's in the begin DATA and end DATA positions indicate that the DATA
segment exceeds the 99,999,999 byte limit. Therefore, the byte offsets to
begin DATA and end DATA are located only in the $BEGINDATA and $ENDDATA
keyword values in the TEXT segment. The begin ANALYSIS and end ANALYSIS byte
offsets are also located in the $BEGINANALYSIS and $ENDANALYSIS keyword values
in the TEXT segment, if an ANALYSIS segment exists.

A third example of a legal HEADER segment is:

FCS3.0******202451**203140****1792**202450*******0*******0

This HEADER is different from the other examples in that it describes a data
set in which the primary TEXT segment follows the DATA segment.
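
As an illustration of the fixed layout described in this section, the following Python sketch reads the six required offsets from the first 58 bytes of a data set. The function name parse_header and the dictionary keys are hypothetical choices for this example; blank offset fields are treated as zero, as the text permits for a missing ANALYSIS segment.

```python
def parse_header(raw):
    """Parse the first 58 bytes of an FCS3.0 data set into a dict of offsets."""
    if len(raw) < 58:
        raise ValueError("HEADER must be at least 58 bytes long")
    version = raw[0:6].decode("ascii")
    if version != "FCS3.0":
        raise ValueError("unexpected version identifier: %r" % version)

    def offset(start):
        # Each offset is right justified in an 8-byte ASCII field.
        field = raw[start:start + 8].decode("ascii").strip()
        return int(field) if field else 0  # blank field treated as 0

    names = ("text_begin", "text_end", "data_begin", "data_end",
             "analysis_begin", "analysis_end")
    # Fields start at byte 10 and follow one another every 8 bytes.
    return {name: offset(10 + 8 * i) for i, name in enumerate(names)}


# The first example HEADER, with '*' standing in for the space character.
example = "FCS3.0*********256****1545****1792**202456*******0*******0"
offsets = parse_header(example.replace("*", " ").encode("ascii"))
```

Applied to the third example HEADER, the same function returns a text_begin value larger than data_end, which is how a reader would notice that the primary TEXT segment follows the DATA segment.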

3.2 TEXT Segment

3.2.1 The TEXT segments (primary and supplemental) contain a series of ASCII
encoded keyword-value pairs that describe various aspects of the data set.
For example, $TOT/5000/ is a keyword-value pair indicating that the total
number of events in the file is 5000. $TOT is the keyword and 5000 is the
value. The '$' character flags this keyword as a standard FCS keyword. In
this example, the '/' is the delimiter character.

3.2.2 A data set must contain a primary TEXT segment which contains all
required keyword-value pairs and any number of optional keyword-value pairs.
The primary TEXT segment must be contained entirely in the first 99,999,999
bytes of the data set.

3.2.3 A data set may contain an optional supplemental TEXT segment that can
contain only optional keyword-value pairs and may be placed anywhere in a
data set after the HEADER segment.

3.2.4 The byte offset to the beginning and end of the supplemental TEXT
segment is found in the $BEGINSTEXT and $ENDSTEXT keyword-value pairs which
must be located in the primary TEXT segment.

3.2.5 The first character in the primary TEXT segment is the ASCII delimiter
character. This character must also be used as the delimiter in the ANALYSIS
and supplemental TEXT segments.

3.2.6 The delimiter is placed at the start and end of a keyword value.

3.2.7 The delimiter may not be the first character in a keyword or keyword
value. If the delimiter appears in a keyword or keyword value, it must be
immediately followed by a second delimiter. For example, "$SYS/RSX-11//M/"
shows a value of RSX-11/M for the keyword $SYS. Since null (zero length)
keywords or keyword values are not permitted, two consecutive delimiters can
never occur between a value and a keyword.
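
The delimiter rules of 3.2.5 to 3.2.7 can be sketched as a small Python tokenizer. This is a minimal illustration rather than a reference implementation: the function name parse_text_segment is hypothetical, the segment is assumed to be already decoded to a string, and keywords are upper-cased on the assumption that a reader ignores keyword case.

```python
def parse_text_segment(segment):
    """Split a decoded TEXT segment into a dict of keyword -> value."""
    delim = segment[0]  # the first character defines the delimiter
    tokens, current, i = [], [], 1
    while i < len(segment):
        ch = segment[i]
        if ch == delim:
            if i + 1 < len(segment) and segment[i + 1] == delim:
                # A doubled delimiter is a literal delimiter character.
                current.append(delim)
                i += 2
                continue
            # A single delimiter ends the current keyword or value.
            tokens.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if len(tokens) % 2:
        raise ValueError("keyword without a value")
    # Tokens alternate keyword, value, keyword, value, ...
    return {tokens[k].upper(): tokens[k + 1] for k in range(0, len(tokens), 2)}


pairs = parse_text_segment("/$TOT/5000/$SYS/RSX-11//M/")
```

Because empty keywords and values are not permitted, a doubled delimiter can always be read as an escaped character, which is what makes this single pass unambiguous.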

3.2.8 All keywords are encoded in ASCII. Keyword values are encoded in ASCII
by default. The values of specified keywords may be in languages not
representable in ASCII by use of the $UNICODE keyword.

3.2.9 Keywords and keyword values must have lengths greater than zero.

3.2.10 Keywords are case insensitive. They may be written in a file in lower
case, upper case, or a mixture of the two, and an FCS file reader must
ignore keyword case. A keyword value may also be in lower case, upper case or
a mixture of the two, but keyword values are case sensitive.

3.2.11 There are no default values for any keyword.

3.2.12 FCS-defined keywords must begin with the '$' character. Only
FCS-defined keywords may begin with the '$' character.

3.2.13 FCS-defined keywords may not be redefined by the implementor.

3.2.14 There are required and optional FCS keyword-value pairs. The required
keyword-value pairs represent the minimum set needed to successfully read and
write an FCS data set. Conformant FCS file reading programs must recognize
required FCS keywords.

3.2.15 The TEXT segments must not contain return (ASCII 13), line feed (ASCII
10) or other unprintable characters unless they are within a keyword value or
are used as the delimiter character.

3.2.16 The parameter description keywords (e.g. $PnR, $PnB, etc.) are numbered
consecutively in the order in which the parameters are written to the file,
beginning with number 1.

3.2.17 The required and optional FCS keywords are listed below with one line
descriptions. The keywords and their values are described in alphabetical
order following the lists. Required keywords are so indicated.

3.2.18 The required FCS primary TEXT segment keywords are as follows:

$BEGINANALYSIS Byte-offset to the beginning of the ANALYSIS segment.

$BEGINDATA Byte-offset to the beginning of the DATA segment.

$BEGINSTEXT Byte-offset to the beginning of a supplemental TEXT segment.

$BYTEORD Byte order for data acquisition computer.

$DATATYPE Type of data in DATA segment (ASCII, integer, floating point).

$ENDANALYSIS Byte-offset to the end of the ANALYSIS segment.

$ENDDATA Byte-offset to the end of the DATA segment.

$ENDSTEXT Byte-offset to the end of a supplemental TEXT segment.

$MODE Data mode (list mode, histogram).

$NEXTDATA Byte offset to next data set in the file.

$PAR Number of parameters in an event.

$PnB Number of bits reserved for parameter number n.

$PnE Amplification type for parameter n.

$PnR Range for parameter number n.

$TOT Total number of events in the data set.
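
A writer or reader could check the list above mechanically. The sketch below is one possible way to do so in Python; the names REQUIRED, PER_PARAMETER_SUFFIXES and missing_required_keywords are hypothetical, and the input is assumed to be a dictionary of keyword-value pairs taken from the primary TEXT segment.

```python
REQUIRED = ("$BEGINANALYSIS", "$BEGINDATA", "$BEGINSTEXT", "$BYTEORD",
            "$DATATYPE", "$ENDANALYSIS", "$ENDDATA", "$ENDSTEXT",
            "$MODE", "$NEXTDATA", "$PAR", "$TOT")
PER_PARAMETER_SUFFIXES = ("B", "E", "R")  # $PnB, $PnE, $PnR


def missing_required_keywords(text):
    """Return the required keywords absent from a primary TEXT dict."""
    # Keywords are case insensitive, so compare in upper case.
    upper = {key.upper(): value for key, value in text.items()}
    missing = [key for key in REQUIRED if key not in upper]
    if "$PAR" in upper:
        # Parameter keywords are numbered consecutively from 1 to $PAR.
        for n in range(1, int(upper["$PAR"]) + 1):
            for suffix in PER_PARAMETER_SUFFIXES:
                name = "$P%d%s" % (n, suffix)
                if name not in upper:
                    missing.append(name)
    return missing
```

An empty result means that every required keyword is present; it says nothing about whether the values themselves are valid.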

3.2.19 The optional FCS TEXT segment keywords are as follows:

$ABRT Events lost due to data acquisition electronic coincidence.

$BTIM Clock time at beginning of data acquisition.

$CELLS Description of objects measured.

$COM Comment.

$COMP Fluorescence compensation matrix.

$CSMODE Cell subset mode, number of subsets to which an object may belong.

$CSVBITS Number of bits used to encode a cell subset identifier.

$CSVnFLAG The bit set as a flag for subset n.

$CYT Type of flow cytometer.

$CYTSN Flow cytometer serial number.

$DATE Date of data set acquisition.

$ETIM Clock time at end of data acquisition.

$EXP Name of investigator initiating the experiment.

$FIL Name of the data file containing the data set.

$GATE Number of gating parameters.

$GATING Specifies region combinations used for gating.

$GnE Amplification type for gating parameter number n.

$GnF Optical filter used for gating parameter number n.

$GnN Name of gating parameter number n.

$GnP Percent of emitted light collected by gating parameter n.

$GnR Range of gating parameter n.

$GnS Name used for gating parameter n.

$GnT Detector type for gating parameter n.

$GnV Detector voltage for gating parameter n.

$INST Institution at which data acquired.

$LOST Number of events lost due to computer busy.

$OP Name of flow cytometry operator.

$PKn Peak channel number of univariate histogram for parameter n.

$PKNn Count in peak channel of univariate histogram for parameter n.

$PnF Name of optical filter for parameter n.

$PnG Amplifier gain used for acquisition of parameter n.

$PnL Excitation wavelength for parameter n.

$PnN Short name for parameter n.

$PnO Excitation power for parameter n.

$PnP Percent of emitted light collected by parameter n.

$PnS Name used for parameter n.

$PnT Detector type for parameter n.

$PnV Detector voltage for parameter n.

$PROJ Name of the experiment project.

$RnI Gating region for parameter number n.

$RnW Window settings for gating region n.

$SMNO Specimen (tube or well) label.

$SRC Source of the specimen (patient name, cell types).

$SYS Type of computer and its operating system.

$TIMESTEP Time step for time parameter.

$TR Trigger parameter and its threshold.

$UNICODE UNICODE code page for string type keyword values.

3.2.20 Alphabetical listing and detailed description of keywords. For all the
keywords below 'n', 'n1', 'n2', etc. represent ASCII-encoded integer values.
The character 'f' represents an ASCII-encoded floating point number. The word
"string" represents an ASCII or UNICODE-encoded TEXT string that can be of
any length greater than zero. If the optional $UNICODE keyword is used, a
specified subset of the strings may be represented with two byte characters
in a variety of UNICODE conformant languages. Otherwise strings are in single
byte ASCII. The character 'c' represents a single ASCII-encoded character.
The '/' character is used here as the delimiter for illustrative purposes.

$ABRT/n/ $ABRT/1265/

Number of events lost due to data acquisition electronic coincidence effects.
The number of aborted events here was 1265.

$BEGINANALYSIS/n/ $BEGINANALYSIS/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
beginning of the optional ANALYSIS segment. If there is no ANALYSIS segment,
a '0' should be placed in this keyword value. In this example, the ANALYSIS
segment begins at byte 123,456,789.

$BEGINDATA/n/ $BEGINDATA/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
beginning of the DATA segment. If the DATA segment is completely contained
within the first 99,999,999 bytes of the data set, this value duplicates the
offset contained in the HEADER segment. In this example, the DATA segment
begins at byte 123,456,789.

$BEGINSTEXT/n/ $BEGINSTEXT/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
beginning of the supplemental TEXT segment. If there is no supplemental TEXT
segment, the value should be set to '0'. In this example, the supplemental
TEXT segment begins at byte 123,456,789.

$BTIM/hh:mm:ss[:tt]/ $BTIM/14:22:10:47/

Clock time at the beginning of data acquisition. The format of the value is
24-hour clock hours:minutes:seconds:number of fractional seconds in units of
1/60 of a second. The fractional seconds [:tt] is optional. Data acquisition
began at 14 hours, 22 minutes, 10 seconds, and 47/60th of a second.
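
As a brief illustration, the hh:mm:ss[:tt] format can be converted to seconds since midnight as follows. The function name parse_fcs_time is hypothetical; the same sketch would apply to any keyword value written in this clock format.

```python
def parse_fcs_time(value):
    """Convert hh:mm:ss[:tt] to seconds since midnight (tt is in 1/60 s)."""
    parts = value.split(":")
    if len(parts) not in (3, 4):
        raise ValueError("expected hh:mm:ss or hh:mm:ss:tt, got %r" % value)
    hours, minutes, seconds = (int(p) for p in parts[:3])
    # The fractional field is optional and counts sixtieths of a second.
    sixtieths = int(parts[3]) if len(parts) == 4 else 0
    return hours * 3600 + minutes * 60 + seconds + sixtieths / 60.0
```

For the example value 14:22:10:47 this gives 51730 whole seconds plus 47/60 of a second.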

$BYTEORD/n1,n2,n3,n4/ $BYTEORD/4,3,2,1/ [REQUIRED]

This keyword specifies the order from numerically least significant [1] to
numerically most significant [4] in which four binary data bytes are written
to compose a 32-bit word in the data acquisition computer. The numbers are
separated by commas (ASCII 44). In VAX computers and personal computers of
the IBM PC type, the byte order is 1,2,3,4 with the least significant byte
written first. In Hewlett Packard, Macintosh and Sun computers, the byte
order is 4,3,2,1 meaning that the least significant byte is written last. In
PDP-11 computers the byte order is 3,4,1,2 meaning that in the two 16-bit
words comprising a 32-bit word, the most significant 16-bit word is written
first. Within the 16-bit word, however, the least significant byte is written
first, which is the same as for a PC. Byte order is discussed more fully in
reference 4. In this example, the most significant byte is written first and
the least significant byte is written last. Use of this keyword enables
collection of data on one computer type and analysis of the data on another
computer type.
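
One way to read the $BYTEORD value is as the significance rank of each byte in the order written. The Python sketch below reassembles a 32-bit word on that reading; the function name decode_word is hypothetical, and the interpretation is inferred from the three byte orders described above rather than stated by the standard in these terms.

```python
def decode_word(raw4, byteord):
    """Rebuild a 32-bit unsigned integer from 4 bytes using a $BYTEORD value.

    byteord: a string such as "1,2,3,4", "4,3,2,1" or "3,4,1,2".
    Position i of the list gives the rank (1 = least significant) of the
    i-th byte as written in the file.
    """
    ranks = [int(field) for field in byteord.split(",")]
    if sorted(ranks) != [1, 2, 3, 4] or len(raw4) != 4:
        raise ValueError("expected four bytes and a permutation of 1,2,3,4")
    value = 0
    for byte, rank in zip(raw4, ranks):
        value |= byte << (8 * (rank - 1))  # rank 1 occupies bits 0-7
    return value
```

The same 32-bit value 0x12345678 would be stored as the bytes 78 56 34 12 under 1,2,3,4, as 12 34 56 78 under 4,3,2,1, and as 34 12 78 56 under 3,4,1,2.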

$CELLS/string/ $CELLS/Normal human peripheral blood/

Type of cells or other objects measured. This specimen is normal human peripheral blood.

$COM/string/ $COM/Incubation time was 47 minutes./

This keyword is used to attach a comment to the data set. It should not be
used as a substitute for other standard keywords. This example shows the use
of $COM to add a brief note to the data set, a note that otherwise might
appear only in a laboratory notebook.

$COMP/n,f1,f2,f3,.../ $COMP/3,0.0,-0.1,0.0,-40.0,0.0,-0.6,0.0,-36.4,0.0/

This keyword enables the efficient storage of a fluorescence compensation
matrix. The matrix has n rows and n columns where n represents the number of
acquisition parameters. f1, f2, f3, ... are floating point numbers
representing the matrix elements. Both positive and negative values are
allowed. A positive or unsigned value indicates that compensation has been
additive while a negative value indicates the more common case of subtractive
compensation. The elements are stored in row-major order, i.e., the elements
in the first row appear first. The matrix element Cij is the percentage of FLj
that has been subtracted electronically from FLi. In the example, the
compensation matrix is 3 x 3 and the matrix elements have the following
subtractive values: C11 = 0.0%, C12 = 0.1%, C13 = 0.0%, C21 = 40.0%,
C22 = 0.0%, C23 = 0.6%, C31 = 0.0%, C32 = 36.4%, and C33 = 0.0%.
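
The row-major layout can be made concrete with a short Python sketch that turns a $COMP value into a list of rows. The function name parse_comp is hypothetical and the sketch only unpacks the stored numbers; it does not apply the compensation to any data.

```python
def parse_comp(value):
    """Parse a $COMP value "n,f1,f2,..." into an n x n list of rows."""
    fields = value.split(",")
    n = int(fields[0])
    elements = [float(field) for field in fields[1:]]
    if len(elements) != n * n:
        raise ValueError("expected %d matrix elements, got %d"
                         % (n * n, len(elements)))
    # Row-major order: the first n elements form row 1, the next n row 2, ...
    return [elements[row * n:(row + 1) * n] for row in range(n)]


matrix = parse_comp("3,0.0,-0.1,0.0,-40.0,0.0,-0.6,0.0,-36.4,0.0")
```

With zero-based indexing, matrix[1][0] holds C21 and matrix[2][1] holds C32, stored as the negative values -40.0 and -36.4 because the compensation was subtractive.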

$CSMODE/n/ $CSMODE/3/

Cell subset mode, i.e., the number "n" of subsets to which an object may
belong. The simplest case is that the cell subset parameter encodes a single
value per object as would be indicated by n = 1. If the value of n is greater
than 1 it indicates that the value of the cell subset parameter may encode n
subset identifiers. In these cases, the $CSVBITS and $CSVnFLAG keyword values
will specify how the cell subset values are encoded. It should be noted that
regardless of the value for this keyword, a cell subset value of zero
indicates that the object is undefined by the analysis scheme that was used.

$CSVBITS/n/ $CSVBITS/4/

The number of bits used to encode a cell subset value. When the $CSMODE
keyword value is greater than 1, the number of bits used to encode a cell
subset identifier must be specified by the $CSVBITS keyword value. In the
cited example, 4 bits, i.e., values of 0-15, are used to encode cell subset
identifiers. See the discussion of the ANALYSIS segment in section 3.4.

$CSVnFLAG/n/ $CSV1FLAG/4096/

The value used as a "flag" to indicate that the "n" identifier field encodes
a value. In the cited example, if bit 13 is set in the value of the cell
subset parameter (parameter value AND 8192 is TRUE), one should read the
second field of bits to decode the value. It is not necessary to set "flags",
but if one wishes to use zero to encode the first subset for any field, one
must set a "flag" to indicate that the zero in that field refers to a subset.
See the discussion of the ANALYSIS segment in section 3.4.

$CYT/string/ $CYT/FACScan/

The name of the flow cytometer used for the data set. Here a FACScan was used.

$CYTSN/string/ $CYTSN/400E370/

The serial number of the flow cytometer used for the data set. Here the
serial number is 400E370.

$DATATYPE/c/ $DATATYPE/I/ [REQUIRED]

This keyword describes the type of data written in the DATA segment of the
data set. The four allowed values are 'I', 'F', 'D', or 'A'. The DATA segment
is a continuous bit stream with no delimiters. 'I' stands for unsigned binary
integer, 'F' stands for single precision IEEE floating point, 'D' stands for
double precision IEEE floating point, and 'A' stands for ASCII. The
additional keywords $PnB (bits per parameter) and $PnR (range per parameter)
are needed to completely describe an event in the DATA segment.

$DATATYPE/I/ means that the events are written as unsigned binary integers.
For each parameter in an event, both the maximum length in bits allocated for
storage of the parameter and the actual integer range used by the parameter
within that allocation are needed. The number of bits per parameter is
specified by $PnB. For example, $P1B/16/ specifies that 16 bits are allocated
for parameter 1. $P1R/1024/ specifies that parameter 1 values range from 0 to
1023. This allows the data word length to be specified, facilitating
compatibility between machines with different data word lengths and enabling
bit compression of the data.

$DATATYPE/F/ means that the data are written as single precision floating
point values in the IEEE standard format. Note that the $PnB keywords should
be set to a value of 32 for each parameter in an event. For example,
$P1B/32/.

$DATATYPE/D/ means that the data are written as double precision floating
point values in the IEEE standard format. The $PnB keyword should be set to a
value of 64 for each parameter in an event. For example, $P3B/64/ says that
parameter 3 is allocated 64 bits of storage space. The IEEE standard formats
for single- and double-precision numbers are given in the table below:

              Single-precision    Double-precision

Sign          bit 31              bit 63
Exponent      bits 30-23          bits 62-52
              bias 127            bias 1023
Fraction      bits 22-0           bits 51-0
Range         3.402823e+38        1.797693e+308
approx.       1.175494e-38        2.225074e-308

$DATATYPE/A/ means that the data are written as ASCII-encoded integer values.
In this case, the keyword $PnB specifies the number of bytes allocated per
value (one byte per character). This represents fixed format ASCII data.
$P1B/4/ indicates that the maximum value for parameter 1 would be 9999. Data
are stored in a continuous byte stream, with no delimiters. If the value of
the $PnB keyword is the * character, e.g., $P1B/*/, the data are free format
and number of characters per parameter value may vary. In this case, all
values are separated by one of the following delimiters: "space", "tab",
"comma", "carriage return", or "line feed" characters. Note that multiple,
consecutive delimiters are treated as a single delimiter. Since there are
significant differences between the way in which consecutive delimiters are
treated by different programming languages, care should be taken when using
this format. Zero values must be explicitly specified by the zero (0)
character. Thus, the string "1,3,, ,3" (note the space between the third and
fourth commas) would only specify three values. It would be treated as
between 3 and 5 values by different programming languages.
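
The free format rule, in which a run of consecutive delimiters counts as a single delimiter, can be sketched in Python as follows. The function name parse_free_ascii is hypothetical and the input is assumed to be the DATA segment already decoded to a string.

```python
import re


def parse_free_ascii(data):
    """Split free format ASCII data ($PnB/*/) into a list of integers."""
    # Space, tab, comma, carriage return and line feed are the delimiters;
    # the '+' makes any run of them act as one separator.
    tokens = re.split(r"[ \t,\r\n]+", data)
    # Leading or trailing delimiters produce empty tokens, which are dropped.
    return [int(token) for token in tokens if token]


values = parse_free_ascii("1,3,, ,3")
```

The example string from the text yields three values, since the run of commas and the space between the two 3s collapses into one separator.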

$DATE/dd-mmm-yyyy/ $DATE/01-OCT-1994/

This keyword specifies the date on which the data set was created. The format
is day-month-year with the number of characters specified by dd-mmm-yyyy.
This data set was created on 01 October 1994. Note that all the character
positions should be filled including leading zeros. Accepted abbreviations
for the months are: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV,
DEC.
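
A reader might validate the dd-mmm-yyyy format with an explicit month table, which avoids any dependence on the locale of the analysis computer. In the sketch below the names MONTHS and parse_fcs_date are hypothetical, and lower-case month abbreviations are rejected on the assumption that only the listed upper-case forms are accepted.

```python
import datetime

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def parse_fcs_date(value):
    """Parse a $DATE value such as 01-OCT-1994 into a datetime.date."""
    day, month, year = value.split("-")
    # All character positions must be filled, including leading zeros.
    if (len(day), len(month), len(year)) != (2, 3, 4):
        raise ValueError("expected dd-mmm-yyyy, got %r" % value)
    # MONTHS.index raises ValueError for an unknown abbreviation.
    return datetime.date(int(year), MONTHS.index(month) + 1, int(day))
```

The example value 01-OCT-1994 parses to 1 October 1994, while a value such as 1-OCT-1994 is rejected because the leading zero is missing.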

$ENDANALYSIS/n/ $ENDANALYSIS/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
end of the ANALYSIS segment. If there is no ANALYSIS segment, a '0' should be
placed in this keyword value. In this example, the ANALYSIS segment ends at
byte 123,456,789.

$ENDDATA/n/ $ENDDATA/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
end of the DATA segment. If the DATA segment is completely contained in the
first 99,999,999 bytes of the data set, this value duplicates the offset
contained in the HEADER segment. In this example, the DATA segment ends at
byte 123,456,789.

$ENDSTEXT/n/ $ENDSTEXT/123456789/ [REQUIRED]

This field contains the byte-offset from the beginning of the data set to the
end of the supplemental TEXT segment. If there is no supplemental TEXT
segment, the value should be set to '0'. In this example, the supplemental
TEXT segment ends at byte 123,456,789.

$ETIM/hh:mm:ss[:tt]/ $ETIM/14:22:10:47/

Clock time at the end of data acquisition. The format of the value is 24-hour
clock hours:minutes:seconds:number of fractional seconds in units of 1/60 of
a second. Data acquisition ended at 14 hours, 22 minutes, 10 seconds, and
47/60th of a second. The fractional seconds keyword value is optional as
indicated by the square brackets.
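
As an illustration only, the value can be converted to seconds since midnight as follows. This Python sketch uses a hypothetical function name, parse_fcs_time, and does not validate field ranges.

```python
def parse_fcs_time(value):
    """Return seconds since midnight for an hh:mm:ss[:tt] value."""
    fields = [int(field) for field in value.strip().split(":")]
    if len(fields) not in (3, 4):
        raise ValueError("expected hh:mm:ss[:tt], got %r" % value)
    hours, minutes, seconds = fields[:3]
    # The optional fourth field counts 1/60ths of a second.
    ticks = fields[3] if len(fields) == 4 else 0
    return hours * 3600 + minutes * 60 + seconds + ticks / 60.0

end_of_acquisition = parse_fcs_time("14:22:10:47")
```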

$EXP/string/ $EXP/A. Smith/

The name of the person initiating the experiment. This experiment was under
the direction of A. Smith.

$FIL/string/ $FIL/071494.001/

The name of the data file that corresponds to this data set. If there is only
one data set in the FCS file, then this file name should be the same as the
name of the FCS file. If this data set is one of several in the FCS file,
then the file name may correspond to a data file collected at some earlier
time. In this example, the data are stored in a file named 071494.001.

$GATE/n/ $GATE/2/

This keyword specifies the number of parameters used for gating. It is
analogous to the $PAR keyword, which specifies the total number of parameters
for each event in the data set. In this example, there are two gating
parameters. The current practice in many flow cytometry laboratories is that
the gating parameters are collected as part of the data set. This fact is
reflected in the redefinition of the $RnI keyword described below.

$GATING/string/ $GATING/R1/ $GATING/R1 AND (R2.OR.R3)/

This keyword specifies the conditions under which the data in the data set
have been acquired. The conditions are set through Boolean operations among
regions defined below using the $RnI and $RnW keywords. Allowed Boolean
operators are AND, OR (inclusive), and NOT. The operands are the regions (Rn).
Operators are separated from operands or other operators by spaces or
periods. Operator precedence is from left to right unless overridden with
parentheses. In the first example, data were collected using gating region
R1. Events with parameter values falling outside R1 were excluded from the
data set. In the second example, an event is included in the data set only if
the appropriate parameter value is inside R1 and is inside R2 or R3 or both.
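
The left-to-right rule differs from the usual convention in which AND binds more tightly than OR, so a small evaluator may help. The Python sketch below is one reading of the rules above. The names tokenize and evaluate_gating are hypothetical, the mapping passed as "inside" is assumed to have been computed elsewhere for a single event, and treating NOT as applying to the single operand that follows it is an assumption.

```python
import re

OPERATORS = ("AND", "OR", "NOT")

def tokenize(expression):
    """Split a $GATING value on spaces and periods, keeping parentheses."""
    for paren in "()":
        expression = expression.replace(paren, " %s " % paren)
    tokens = expression.replace(".", " ").upper().split()
    for token in tokens:
        if token not in OPERATORS + ("(", ")") and not re.fullmatch(r"R\d+", token):
            raise ValueError("unexpected token %r" % token)
    return tokens

def evaluate_gating(expression, inside):
    """Evaluate a $GATING value for one event.

    inside -- dict mapping region names such as "R1" to True or False
    """
    tokens = tokenize(expression)
    position = 0

    def operand():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "NOT":
            return not operand()
        if token == "(":
            value = chain()
            if position >= len(tokens) or tokens[position] != ")":
                raise ValueError("missing closing parenthesis")
            position += 1
            return value
        return inside[token]

    def chain():
        # AND and OR have equal precedence and are applied left to right.
        nonlocal position
        value = operand()
        while position < len(tokens) and tokens[position] in ("AND", "OR"):
            operator = tokens[position]
            position += 1
            right = operand()
            value = (value and right) if operator == "AND" else (value or right)
        return value

    result = chain()
    if position != len(tokens):
        raise ValueError("unexpected token %r" % tokens[position])
    return result
```

Under this reading, "R1 OR R2 AND R3" is evaluated as (R1 OR R2) AND R3, so an event inside R1 only is excluded unless parentheses are added.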

$GnE/f1,f2/ $G3E/4.0,0.01/

This keyword specifies whether linear or logarithmic amplifiers were used for
gating parameter number n. When the amplification is logarithmic the value of
f1 specifies the number of logarithmic decades and f2 represents the linear
value that would have been obtained for a signal with a log value of 0. In
the example above, the data for parameter 3 were collected using a
four-decade logarithmic amplifier and the 0 channel represents the linear
value, 0.01. When linear amplification is used or when amplification is
undefined such as with some calculated parameters, f1 and f2 are set to 0.

$GnF/string/ $G2F/520LP/

This keyword specifies the optical filter that was used for the light
reaching the detector for gating parameter n. This example shows that the
optical filter used for the second gating parameter was a type 520 nm long
pass.

$GnN/string/ $G1N/FL2/

This keyword specifies a short name for gating parameter number n. Here "FL2"
is the name for gating parameter 1. Required short names for parameters
include the following:

FS Forward Scatter

SS Side scatter

FLn Fluorescence channel n

AE Axial Extinction

CV Coulter Volume

TIME Time

$GnP/n1/ $G3P/27/

The amount of light collected by the detector for gating parameter number n
expressed as a percentage of the light emitted by a fluorescent object. In
the example, 27% of the emitted light was captured by the detector for gating
parameter number 3.

$GnR/n1/ $G2R/1024/

This keyword specifies the range, n1, of gating parameter n. In this example,
the events for gating parameter 2 range from 0 to 1023.

$GnS/string/ $G1S/FITC-CD45/

This keyword specifies a longer name for gating parameter n than is allowed
by $GnN. Here, FITC-labeled CD45 is the name for gating parameter 1.

$GnT/string/ $G2T/PMT9524/

This keyword specifies the detector type for gating parameter n. Here, gating
parameter 2 uses a photomultiplier tube (PMT) of type 9524.

$GnV/n1/ $G2V/645/

This keyword specifies the detector bias voltage for gating parameter n. In
this example, the detector for gating parameter 2 is biased at 645 volts.

$INST/string/ $INST/Laboratory of FCM, RPCI/

The institution or laboratory in which the data were collected. In this
example, the data were collected in the Laboratory of Flow Cytometry at
Roswell Park Cancer Institute.

$LOST/n/ $LOST/457/

This keyword specifies the number of events lost during data acquisition
because the computer was busy with other tasks. Here, 457 events were so
lost.

$MODE/c/ $MODE/L/ [REQUIRED]

This keyword specifies the mode in which the data were acquired. Allowed
values for the character c are 'C', 'L', or 'U'. These options are described
as follows:

C One correlated multivariate histogram is stored in the data set as a
multidimensional array. There can be only one such histogram per data set. In
storing multiparameter correlated data, the index for the first parameter is
incremented first, then the second, etc. For bivariate data, the first data
value corresponds to index 1 for parameter 1 and index 1 for parameter 2, the
second data value corresponds to index 2 for parameter 1 and index 1 for
parameter 2, etc.

L List mode. For each event, the value of each parameter is stored in the
order in which the parameters are described. The number of bits reserved for
parameter 1 is described using the $P1B keyword. There can be only one set of
list mode data per data set. The $DATATYPE keyword describes the data format.
This is the most versatile mode for the storage of flow cytometry data
because mode C and mode U data can be created from mode L data.

U Uncorrelated univariate histograms. There can be more than one univariate
histogram per data set. The histogram frequencies for parameter 1 are stored
first followed by those for parameter 2, etc. If the univariate histograms
have been gated, they must all have been acquired with the same gates so that
the total number of events in each histogram is the same.
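
For mode C, the storage order described above (first parameter incremented first) fixes where each histogram cell sits in the DATA segment. The following Python sketch computes that position; the function name histogram_position is hypothetical, and the 64-channel ranges in the usage note are made-up values.

```python
def histogram_position(indices, ranges):
    """Return the 0-based position of one cell of a mode C histogram.

    indices -- 1-based channel index for each parameter, as in the text
    ranges  -- number of channels for each parameter ($PnR values)
    """
    position, stride = 0, 1
    for index, size in zip(indices, ranges):
        if not 1 <= index <= size:
            raise ValueError("index %d outside 1..%d" % (index, size))
        # The first parameter varies fastest, so its stride is 1.
        position += (index - 1) * stride
        stride *= size
    return position
```

With two parameters of 64 channels each, indices (1, 1) give position 0, indices (2, 1) give position 1, and indices (1, 2) give position 64.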

$NEXTDATA/n/ $NEXTDATA/202512/ [REQUIRED]

When there is more than one data set in an FCS file, this keyword gives the
byte offset from the beginning of a data set to the first byte in the HEADER
of the next data set in the FCS file. If n is zero (0), this is the final or
only data set in the file. This example shows that the next data set begins
at byte 202512 from the beginning of the present data set. Each data set
stands alone and must contain a full complement of keywords.
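
Because each $NEXTDATA value is counted from the start of its own data set, the absolute position of a later data set is a running sum. A minimal Python sketch follows, assuming the $NEXTDATA values have already been read in file order; the function name data_set_starts and the second value in the usage note are hypothetical.

```python
def data_set_starts(nextdata_values):
    """Return the absolute byte position of each data set in an FCS file.

    nextdata_values -- $NEXTDATA values in file order; each is relative
                       to the beginning of the data set that contains it
    """
    starts = [0]
    for offset in nextdata_values:
        if offset == 0:
            # A zero marks the final (or only) data set.
            break
        starts.append(starts[-1] + offset)
    return starts
```

For example, data_set_starts([202512, 150000, 0]) places the three data sets at bytes 0, 202512 and 352512 of the file.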

$OP/string/ $OP/Dave/

The name of the operator of the flow cytometer. Here Dave was the operator of
this instrument.

$PAR/n/ $PAR/5/ [REQUIRED]

This keyword specifies the total number of parameters stored in each event in
the data set. In this example, data for five parameters are stored for each
event.

$PKn/n1/ $PK2/374/

For a univariate histogram of parameter n, this keyword specifies the channel
number, n1, containing the highest frequency of events. In this example, the
peak in the univariate histogram for parameter 2 is located in channel 374.
The $PKNn keyword specifies the count in that channel.

$PKNn/n1/ $PKN2/12803/

For a univariate histogram of parameter n, this keyword specifies the number
of events, n1, in the channel number (histogram bin) containing the maximum
event frequency. In this example, the univariate histogram for parameter 2
has a maximum event frequency of 12803. The $PKn keyword above specifies that
this peak count occurs at channel 374.

$PnB/n1/ $P3B/16/ [REQUIRED]

For $DATATYPE/I/(binary integers), this keyword specifies the number of bits
allocated, n1, for storage of parameter n. In this example, the data value
for parameter 3 would be stored as two bytes (16 bits). This keyword is used
in conjunction with $PnR to determine how the data are actually stored. A
flow cytometer with 10-bit analog-to-digital converters (ADCs) would have
$PnR/1024/. A 10-bit number would be stored in the 16-bit space allocated by
$PnB/16/ leaving 6 empty bits per parameter. These keywords enable tight bit
packing of events. For example, the data storage could be specified by
$PnB/10/$PnR/1024/ for each of the n parameters in an event. Then fewer bits
would be wasted in storing each event. However, packing these data for
storage and unpacking them later for analysis is very time-consuming. In
practice, most flow cytometers use $PnB/16/$PnR/1024/ for 10-bit data. A flow
cytometer with 8-bit ADCs would use $PnB/8/$PnR/256/ where n represents
integers from one to the number of parameters measured.

For $DATATYPE/A/ (ASCII-encoded integers), $PnB specifies the number of
characters, n1, per measured value for parameter n.
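
The trade-off between padded and tightly packed storage can be put in numbers. The Python sketch below is illustrative only; the function name event_storage is hypothetical and the five-parameter event is an assumed example.

```python
def event_storage(bits, ranges):
    """Return (bits stored per event, bits left unused per event).

    bits   -- $PnB value for each parameter ($DATATYPE/I/)
    ranges -- $PnR value for each parameter
    """
    stored = sum(bits)
    # (r - 1).bit_length() is the number of bits needed for 0 .. r - 1.
    needed = sum((r - 1).bit_length() for r in ranges)
    return stored, stored - needed

padded = event_storage([16] * 5, [1024] * 5)   # $PnB/16/ $PnR/1024/
packed = event_storage([10] * 5, [1024] * 5)   # $PnB/10/ $PnR/1024/
```

For five 10-bit parameters the padded layout stores 80 bits per event, 30 of them unused, while the packed layout stores 50 bits with none unused.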

$PnE/f1,f2/ $P3E/4.0,0.01/ [REQUIRED]

This keyword specifies whether linear or logarithmic amplifiers were used for
parameter number n. When the amplification is logarithmic the value of f1
specifies the number of logarithmic decades and f2 represents the linear
value that would have been obtained for a signal with a log value of 0. In
the example above, the data for parameter 3 were collected using a
four-decade logarithmic amplifier and the 0 channel represents the linear
value, 0.01. When linear amplification is used or when amplification is
undefined such as with some calculated parameters, f1 and f2 are set to 0.
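
One common way to apply these two values is to spread the f1 decades evenly over the channel range given by $PnR. The Python sketch below takes that approach. The function name channel_to_linear is hypothetical, the range of 1024 in the usage note is an assumed $P3R/1024/, and the even mapping is a conventional interpretation, not wording from this section.

```python
def channel_to_linear(channel, decades, zero_value, channel_range):
    """Convert a stored channel number to a linear-scale value.

    decades       -- f1 of $PnE
    zero_value    -- f2 of $PnE, the linear value at channel 0
    channel_range -- $PnR for the same parameter
    """
    if decades == 0:
        # Linear (or undefined) amplification: the channel is the value.
        return float(channel)
    return zero_value * 10.0 ** (decades * channel / channel_range)

low = channel_to_linear(0, 4.0, 0.01, 1024)     # bottom of the scale
mid = channel_to_linear(512, 4.0, 0.01, 1024)   # two decades up
```

With $P3E/4.0,0.01/ the scale runs from 0.01 at channel 0 toward 100 at the top of the range, and channel 512 lies two decades above the bottom.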

$PnF/string/ $P2F/520LP/

This keyword specifies the optical filter that was used for the light
reaching the detector for parameter n. This example shows that the optical
filter used for the second parameter was a type 520 nm long pass.

$PnG/f/ $P2G/10.0/

This keyword specifies the gain that was used to amplify the signal for
parameter n. This example shows that parameter 2 was amplified 10.0-fold
before digitization.

$PnL/n1/ $P1L/488/

This keyword specifies the excitation wavelength, n1, in nm for parameter n.
In this example, the wavelength was 488 nm for parameter number 1.

$PnN/string/ $P3N/FL1/

This keyword is used to specify the short name of parameter n. Here parameter
3 has a short name of FL1. Required short names for parameters include the
following:

CS Cell subset

FS Forward Scatter

SS Side scatter

FLn Fluorescence channel n

AE Axial Extinction

CV Coulter Volume

TIME Time

$PnO/n1/ $P2O/200/

This keyword specifies the excitation power, n1, in milliwatts for the light
source associated with the measurements for parameter n. Here 200 mW was used
to produce the signal associated with parameter 2.

$PnP/n1/ $P4P/50/

The amount of light collected by the detector for parameter number n
expressed as a percentage of the light emitted by a fluorescent object. In
the example, 50% of the emitted light was captured by the detector for
parameter number 4.

$PnR/n1/ $P2R/1024/ [REQUIRED]

This keyword specifies the maximum range, n1, of parameter n. For $MODE/L/
(list mode data), this corresponds to the ADC range, here 1024. The data
values can range from 0 to 1023. For histogram data ($MODE/C/ or
$MODE/U/), it is the number of channels, n1, in the histogram for parameter
n. Here the histogram channel numbers range from 0 to 1023.

$PnS/string/ $PnS/CD45 FITC Fluorescence/

This keyword specifies a long name to be used as an axis label in a plot of
parameter n. Here FITC-labeled CD45 is the label. $PnS is the long name
equivalent of $PnN.

$PnT/string/ $P2T/PMT9524/

This keyword specifies the detector type for parameter n. Here, parameter 2
uses a photomultiplier tube (PMT) of type 9524.

$PnV/n1/ $P2V/645/

This keyword specifies the detector bias voltage, n1, in Volts for parameter
n. In this example, the detector for parameter 2 is biased at 645 Volts.

$PROJ/string/ $PROJ/AML patient study/

This keyword provides the name of the project. Here it is an AML patient study.

$RnI/string1,[string2]/ $R3I/P2,P4/ $R2I/G3/

This keyword associates a gating region number, n, with one or two
parameters, here shown as string1 and string2. The two strings are of the
form "Pn" or "Gn". "Pn" stands for collected parameter n, while "Gn" stands
for gating parameter n. In the first example, gating region 3 is associated
with a bivariate dot plot or bivariate histogram for parameters 2 and 4. The
$RnW keyword described below specifies the shape of the gating region. In the
second example, gating region 2 is associated with gating parameter 3. See
the discussion for the $GATE keyword.

$RnW/n1, n2[;n3, n4;...]/ $R1W/345, 366/

This keyword specifies the window settings for gating region n. This window
setting is useful only if the $RnI keyword is also specified. If the keyword
$RnI has only a single value, then n1 and n2 specify the inclusive lower and
upper bounds for the window in a univariate histogram. For example,
$R2I/G3/$R2W/345,366/ specifies that gating region 2 is associated with gating
parameter 3. The gated events must range between channels 345 and 366
inclusive. If the $RnI keyword value has two values, then the window exists
in a bivariate plot and it is specified in the $RnW keyword as a polygon. The
x and y coordinates of the first point in the polygon are the pair n1, n2.
The next point is separated from the first by a ';' character and is
represented as n3, n4 above. The polygon can contain any number of points
separated by semicolons. The first point and the last point are assumed to be
connected. For example,
$R1I/P2,P3/$R1W/310,205;515,304;480,615;240,514;354,542/ specifies that region 1
is defined in parameters 2 and 3 and that the region 1 window is a 5-sided
polygon in this 2-parameter space. The $GATING keyword will specify the way
the windows will be used (AND, OR, etc.).
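
A window value can be parsed and tested against an event as sketched below in Python. The function names are hypothetical. The even-odd (ray casting) rule used for the polygon, and the treatment of points lying exactly on an edge, are assumptions; the text above does not specify them.

```python
def parse_window(value):
    """Split an $RnW value into a list of (n1, n2) integer pairs."""
    return [tuple(int(number) for number in pair.split(","))
            for pair in value.split(";")]

def in_univariate_window(window, channel):
    """Inclusive lower and upper bounds, as for a single-valued $RnI."""
    low, high = window[0]
    return low <= channel <= high

def in_polygon_window(window, x, y):
    """Even-odd test; the last vertex is joined back to the first."""
    inside = False
    for i in range(len(window)):
        x1, y1 = window[i]
        x2, y2 = window[i - 1]        # i - 1 wraps around to the last vertex
        if (y1 > y) != (y2 > y):      # this edge crosses the event's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

gate = parse_window("345, 366")
accepted = in_univariate_window(gate, 350)
```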

$SMNO/string/ $SMNO/A7/

This keyword specifies the specimen number, which could be a tube or well
number. Here the specimen number is A7.

$SRC/string/ $SRC/J. Doe, HIV positive patient/

This keyword specifies the source of the specimen. Note that this keyword
value could contain patient information, which is protected by the U.S.
Privacy Act and by strict U.S. National Institutes of Health guidelines. The
acquiring laboratory may choose to use encoded information for this keyword
value.

$SYS/string/ $SYS/Macintosh System 7.5/

This keyword specifies the type of computer and the operating system under
which the data set was collected. Here the data set was collected on a
Macintosh running System 7.5.

$TIMESTEP/f/ $TIMESTEP/0.0167/ $TIMESTEP/1.0/

The presence of this keyword indicates that time has been collected as one of
the parameters in the data set. $PnN/TIME/ specifies which parameter
represents time. $TIMESTEP specifies the time step in seconds. In the first
example, the time step is 0.0167 seconds, which is 1/60 of a second and is
the typical clock tick on a personal computer. For this example, an
implementor specifies $P6N/TIME/$P6B/16/ $P6R/65536/$TIMESTEP/0.0167/. When
the first event in the data set is captured by the computer, the number of
clock ticks since the computer was turned on is read and saved as a constant,
nTicks. A zero value is entered into parameter 6 in this first event. When
the second event arrives, the number of clock ticks is obtained from the
computer clock. nTicks is subtracted from this number and the result stored
as parameter 6 of the second event. The actual number of seconds between any
subsequent event and the first event is obtained by multiplying the parameter
6 value by the $TIMESTEP value. In this example, the maximum time range is
approximately 18 minutes (65535 steps of 0.0167 seconds). In the second
example, an implementor specifies
$P6N/TIME/$P6B/16/$P6R/65536/$TIMESTEP/1.0/. Using the same procedure as in
the first example, any events arriving less than 1.0 second after the first
event have a parameter value of zero, while those arriving between 1.0 second
and (less than) 2.0 seconds have a parameter 6 value of 1. The maximum time
range is approximately 18 hours. If an external constant time interval
generator is used to provide a signal input that increases linearly with
time, the appropriate TEXT keywords might be $P6N/TIME/$P6B/16/$P6R/1024/
$TIMESTEP/0.001/. Here the time step is smaller than that available from the
computer clock. However, the number of steps is limited by the range of the
ADC, here 10 bits. The maximum time range for this example is 1.023 seconds
(1023 steps of 0.001 seconds).
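
The arithmetic behind the time parameter is simple enough to sketch. In the Python fragment below, the function names are hypothetical and the largest storable value is taken to be $PnR minus one.

```python
def elapsed_seconds(time_value, timestep):
    """Seconds between the first event and an event with this time value."""
    return time_value * timestep

def max_time_range(parameter_range, timestep):
    """Longest interval representable, in seconds.

    parameter_range -- $PnR of the time parameter; the largest stored
                       value is parameter_range - 1
    """
    return (parameter_range - 1) * timestep

clock_tick_range = max_time_range(65536, 0.0167)   # 1/60 s clock ticks
one_second_range = max_time_range(65536, 1.0)      # 1.0 s steps
```

With a range of 65536 and a 1.0 second step, the result is 65535 seconds, a little over 18 hours.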

$TOT/n/ $TOT/25000/ [REQUIRED]

This keyword specifies the total number of events in the data set. This data
set contains 25000 events.

$TR/string, n/ $TR/FS,54/

This keyword specifies the parameter name which serves as the trigger signal
for an event. The number, n, is the channel number of the threshold
signifying an event. When the threshold is exceeded, an event is declared.
Here forward scatter (FS) is the trigger signal and the event threshold is at
channel 54.

$UNICODE/n,string1,string2,etc/ $UNICODE/3,$SYS,$SRC/

The integer 'n' represents the UNICODE page number used and the comma
delimited strings represent the keyword values where UNICODE text is used.
UNICODE is an international standard that enables computer representation of
most of the world's languages. The characters for each language are
represented as two-byte codes on a code page. There are 65536 codes
available. U.S. ASCII requires 256 two-byte characters. For computer systems
that support UNICODE, implementors will be able to present axis labels and
other appropriate text strings in the language of the country in which the
flow cytometry data are being collected. If this keyword is not present,
single byte U.S. ASCII is used for all strings. In the example above, UNICODE
page 3 was used to write the values for the $SYS and $SRC keywords.

3.3 DATA Segment

The DATA segment contains the raw data in one of three modes (list,
correlated or uncorrelated) described in the primary TEXT segment by the
$MODE keyword value. The data are written to the DATA segment in one of four
allowed formats (binary, floating point, double precision floating point or
ASCII) described by the $DATATYPE keyword value. The most common form of data
storage is list mode storage in the form of binary integers ($DATATYPE/I/
$MODE/L/). The $PnB set of keywords specify the bit width for the storage of
each parameter. The $PnR set of keywords specify the channel number range for
each parameter. For example, $P1B/16/ $P1R/1024/ specifies a 16-bit field for
parameter 1 and a range for the values of parameter 1 from 0 to 1023, which
corresponds to 10 bits. Implementors should use a bit mask when reading these
list mode parameter values to ensure that erroneous values are not read from
the 6 unused bits.
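
The masking step can be sketched in Python as follows. This is an illustration under several assumptions: every parameter uses $PnB/16/, every $PnR is a power of two (so that range minus one is a valid mask), and the byte order is little-endian. The actual byte order is given by the $BYTEORD keyword, which is described elsewhere in the Standard. The function name read_list_mode is hypothetical.

```python
import struct

def read_list_mode(data, ranges, byte_order="<"):
    """Decode list mode events stored as 16-bit unsigned integers.

    data       -- bytes of the DATA segment
    ranges     -- $PnR value for each parameter, assumed powers of two
    byte_order -- struct prefix: "<" little-endian, ">" big-endian
    """
    masks = [r - 1 for r in ranges]            # 1024 -> 0x03FF
    layout = byte_order + "H" * len(ranges)    # one 16-bit field per parameter
    events = []
    for fields in struct.iter_unpack(layout, data):
        # Clear any bits above the declared range before use.
        events.append(tuple(value & mask for value, mask in zip(fields, masks)))
    return events

# Two parameters; the first field carries stray bits above bit 9.
raw = struct.pack("<HH", 0xFC05, 1023)
events = read_list_mode(raw, [1024, 1024])
```

Here the stray upper bits of the first field are discarded, leaving the event (5, 1023).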

3.4 ANALYSIS segment

ANALYSIS is an optional segment that, when present, contains the results of
data processing. It is often the case that analysis is performed off-line,
after the data has been collected and stored in a data set. Therefore, the
ANALYSIS segment typically contains information added to a copy of the
original file. For example, the results of cell cycle analysis or
immunophenotype determinations often involve more complex analyses than can
be performed in "real time" as the data is collected and stored. The ANALYSIS
segment has the same structure as the TEXT segment; i.e., it consists of a
series of keyword-value pairs. There are no required keywords for the
ANALYSIS segment. The optional FCS keywords are listed in 3.4.1 with one line
descriptions and in 3.4.2 with full descriptions and examples. Implementors
may add their own keywords.

A proposal has been made that the ANALYSIS segment be used for identifying
cell subsets, determined either by region drawing or by some partitioning
method such as cluster analysis (4). This may be particularly useful for
immunophenotyping data. Three approaches to identifying cell subsets are
discussed below. The first two use the least space in the data set but
require the cell subsets be disjoint. The third approach adds a parameter to
each event and supports overlapping cell subset assignments.

In method 1, the implementor uses the TEXT segment keyword-value pairs
$CSMODE/1/ and $CSTOT/n/ to specify that there is one group of cell subsets
containing n disjoint subsets of cells. The TEXT segment keyword-value pair
$CSVBITS/8/ is used to indicate that the cell subset assignments for each
event are stored in a binary vector of unsigned characters (8 bits each)
whose length is the number of events in the data set. This vector is stored
in an OTHER segment following the ANALYSIS segment. The DATA segment contains
a copy of the original data with the events written in the same order as in
the original data set. In the ANALYSIS segment, $CSnNUM is used to count the
number of cells in each of the n subsets.

In method 2, the implementor uses the TEXT segment keyword-value pairs
$CSMODE/1/ and $CSTOT/n/ as above but does not use the $CSVBITS keyword. In
the DATA segment, the events are written out one cell subset at a time rather
than in the original event order. In the ANALYSIS segment, $CSnNUM is used to
count the number of cells in each of the n subsets. No OTHER segment is
required.

Method 3 creates an additional cell subset (CS) parameter for each event in
the data set. Cell subsets may be defined by the method, e.g., cluster
analysis, neural network, boolean gates on combinations of parameters,
hyperplanes in n-dimensional space, etc. The value of the parameter may
encode a single subset identifier number for each event ($CSMODE/1/) or more
than one identifier number per event (value of $CSMODE greater than 1). The
meanings of the identifier numbers are specified by the values of the
$CSnNAME keywords in the TEXT and ANALYSIS segments. If the value of the CS
parameter is 0 (zero), that event is unclassified by the definitions used to
assign cell subsets. If the classification scheme creates unique
non-overlapping populations, e.g., CD4 T cells, CD8 T cells, B cells,
monocytes/macrophages, neutrophils, etc., then the simplest approach is to
set the value of $CSMODE to "1" and use 1 == CD4 T cells, 2 == CD8 T cells, etc.
In some situations, it may be useful to be able to assign a single cell to
more than one defined subset. For example, to extend the preceding example,
subset identifiers 1 - 5 would correspond to the definitions listed above with
6 == lymphocytes and 7 == mononuclear cells. This scheme would require
$CSMODE/3/ since a single cell could belong to three defined subsets.
Operationally, assuming that an event in the data set is a CD4 T cell, then
the first bit field would encode a value of 1 (CD4 T cell), the second bit
field would encode a value of 6 (lymphocyte), and the third bit field would
encode a value of 7 (mononuclear cell). The bit fields and their
interpretations in these cases would be defined by the values of the $CSVBITS
and the $CSVnFLAG keywords as outlined in the reference (4). Method 3 also
supports the creation of an ANALYSIS segment that includes a summary for the
results written as the values for the keywords pertaining to the numbers of
cells in each subset, etc. Method 3 has the size "cost" of an additional
parameter, but it permits one to include a complete and explicit record of an
analysis as an integral part of a data set.

3.4.1 Optional FCS ANALYSIS segment keyword list:

$CSDATE Cell subset analysis date.

$CSDEFFILE Cell subset definition file name.

$CSEXP Name of person who performed the cell subset analysis.

$CSnNAME Name of cell subset number n.

$CSnNUM Number of cells in cell subset number n.

3.4.2 Optional FCS ANALYSIS segment keywords:

$CSDATE/dd-mmm-yyyy/ $CSDATE/26-OCT-1994/

Cell subset date. This keyword specifies the date on which the data set
containing the cell subset analysis was created. The format of the date is
the same as that for $DATE. This data set was created on 26 October 1994.

$CSDEFFILE/string/ $CSDEFFILE/c:\filename.dat/

Cell subset definition file. The string is the name of the file containing
the information needed to define each of the cell subsets. In the example the
cell subset definition file is named filename.dat and is located on drive c:.

$CSEXP/string/ $CSEXP/A. Smith/

Cell subset experimenter. Name of the person who performed the cell subset
analysis. Here, A. Smith performed the cell subset analysis.

$CSnNAME/string/ $CS2NAME/lymphocytes/

Cell subset name. This is a string naming cell subset number n. In the
example, cell subset 2 is named "lymphocytes".

$CSnNUM/n1/ $CS2NUM/3456/

This keyword specifies the number of cells, n1, in cell subset number n. In
the example, cell subset 2 contains 3456 cells.

3.5 CRC Value

The CRC word is computed for the part of each data set beginning with the
first byte of the HEADER segment and ending with the last byte of the final
segment of the data set (which could be a TEXT, DATA, ANALYSIS or OTHER
segment). The CRC word is a 16-bit cyclic redundancy check value (5). This
16-bit CRC word conforms to the CCITT standard (Comité Consultatif
International Téléphonique et Télégraphique). This standard uses the
CCITT polynomial X^16 + X^12 + X^5 + 1 and requires that each input character
be interpreted as its bit-reversed image. These requirements are satisfied by
the icrc function in reference 5 if the last two function arguments are 0 and
-1, respectively. The CRC value will be placed as ASCII in the 8 bytes
immediately after the end of the data set. If an implementor chooses not to compute
and store a CRC word then the 8 bytes immediately after the end of the data
set should be filled with ASCII '0' characters.
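
For readers without access to the icrc routine, the Python sketch below shows a bit-reversed (reflected) 16-bit CRC using the CCITT polynomial with a zero initial value. The function name crc16_ccitt_reflected is hypothetical. That this variant reproduces the icrc output for arguments 0 and -1 is an assumption that should be verified against a reference implementation before use. How the 16-bit result is laid out in the 8 ASCII bytes is not specified in this section and is left out.

```python
def crc16_ccitt_reflected(data, crc=0):
    """Compute a reflected CRC-16 over a bytes object.

    0x8408 is the CCITT polynomial X^16 + X^12 + X^5 + 1 (0x1021) with
    its bit order reversed, which is how the bit-reversed-input
    convention is usually implemented.
    """
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out the low bit; apply the polynomial if it was set.
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc

# The CRC covers everything from the first byte of the HEADER to the
# last byte of the final segment; here a short byte string stands in.
check_word = crc16_ccitt_reflected(b"FCS3.0")
```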

3.6 Other Segments

Implementors may create any number of OTHER segments as they choose.

4. References

1. Murphy RF, Chused TM: A proposal for a flow cytometric data file standard.
Cytometry 5:553-555, 1984.

2. Dean PN, Bagwell CB, Lindmo T, Murphy RF and Salzman GC: Data File
Standard for Flow Cytometry. Cytometry 11:323-332, 1990.

3. The Unicode Consortium: The UNICODE Standard, Version 1.0, vol. 1.
Addison-Wesley Publishing Co. Inc., Reading, MA, 1991.

4. Redelman D, Coder DM: Cell subset (CS) parameter to record the identities
of individual cells in flow cytometric data. Cytometry 18:95-102, 1994.

5. Press WH, Teukolsky SA, Vetterling WT, Flannery BP: Numerical Recipes in
C. 2nd ed. Cambridge University Press, Cambridge, UK, 1992.

5.1 Appendix A: Major Differences Between FCS2.0 and FCS3.0.

1) The HEADER has been modified to accommodate data sets longer than
99,999,999 bytes. Any offset value that requires more than 8 bytes is now
represented by placing a '0' in the HEADER for that value and its associated
"$BEGIN" value. The actual byte-offset value is then found in the primary
TEXT segment of the data set. This system allows the vast majority of data
files to be backwards compatible with analysis software designed for previous
FCS versions. However, a '0' byte-offset in the HEADER will prevent previous
FCS versions from reading very large data sets, avoiding read errors or
partial data reads. Note, $BEGINDATA, $ENDDATA, $BEGINANALYSIS, $ENDANALYSIS,
$BEGINSTEXT and $ENDSTEXT keyword-value pairs are required in the primary TEXT
segment of FCS3.0 conformant files irrespective of the size of the data set.
When the size of a data set remains below the 100 megabyte limit, the byte
offsets will be found both in the HEADER and in keyword value pairs in the
primary TEXT segment. When a data set reaches or exceeds 100 megabytes, byte
offsets will only be located in the primary TEXT segment.

2) A supplemental TEXT segment may now be included in a data set. The
supplemental TEXT segment may contain only optional keyword-value pairs and
may be located anywhere in a data set after the HEADER segment.

3) A primary TEXT segment must contain all required keyword-value pairs and
be located entirely within the first 99,999,999 bytes of a data set.

4) An optional 16-bit CRC check has been added to the end of each data set.
This internal check-word allows for data set integrity checks.

5) To enable third party or off-line analysis software to correctly read and
interpret data, the keyword $PnE is now required for each parameter. The $PnE
keyword describes the method of amplification used for a given parameter.

6) There are a number of new optional FCS TEXT Segment keywords. $CSMODE,
$CSTOT, $CSVBITS, $CSVnFLAG specify an added parameter to identify cell
subsets. $CYTSN specifies the cytometer serial number. $RnI has been
redefined. $TIMESTEP has been added to enable use of a time parameter.
$UNICODE enables the specification of certain keywords in languages not
representable with ASCII text.

7) The $DATE keyword value for year is increased by two bytes to -yyyy.

8) The following optional ANALYSIS segment keywords have been added: $CSDATE,
$CSDEFFILE, $CSEXP, $CSnNAME, and $CSnNUM to enable specification of cell
subsets.

9) The definitions of the $BYTEORD and $PnE keywords have been corrected and
clarified. The $PnG keyword has been added, describing the linear gain
applied to a signal.

10) The $COMP keyword has replaced $DFCiTOj for the description of
fluorescence compensation.

5.2 Appendix B: Data File Standards Committee of the International Society
for Analytical Cytology

Larry Seamer, Chair
Director, Flow Cytometry Facility
University of New Mexico
Cancer Center, Cytometry
900 Camino de Salud NE
Albuquerque, NM 87131
(505) 277-6206
lseamer@cobra.unm.edu

Bruce Bagwell
Maine Medical Center Research Institute
70 John Roberts Road, Suite 8
South Portland, ME 04106
75450.167@compuserve.com

Luther Barden
Div. of Computer Research and Technology,
Building 12A Room 2015
National Institutes of Health
9000 Rockville Pike
Bethesda, MD 20892
luther_barden@nih.gov

Marc Christofferson
Becton Dickinson Immunocytometry Systems
2350 Qume Drive
San Jose, California 95131-1807
(408) 954-2058
m_chr@BDIS.com

Louise E. Magruder
Division of Clinical Laboratory Devices
FDA/CDRH/ODE
72 Gaither Road
Rockville, MD 20850
lem@fdadr.cdrh.fda.gov

George Malachowski
Cytomation, Inc.
400 E. Horsetooth Rd.
Ft. Collins, CO
(303)226-2200

Robert F. Murphy
Associate Professor
Department of Biological Sciences and Center for
Light Microscope Imaging and Biotechnology
Carnegie Mellon University
4400 Fifth Avenue, Box 52
Pittsburgh, Pennsylvania 15213
(412) 268-3480
murphy+@cmu.edu

Doug Redelman
Sierra Cytometry
3150 Susileen Dr.
Reno, NV 89509

Gary C. Salzman
Life Sciences Division
Los Alamos National Laboratory
Mail Stop M888
Los Alamos, NM 87545
(505)667-5503
salzman@lanl.gov

James C.S. Wood
Coulter Corporation
Mail Code 52-A01
11800 S.W. 147th Avenue
Miami, FL 33196-2500
(305)380-2449 or 344-1290 (voice)
(305)344-5240 (FAX)
woodjcs@gate.net

____________________________________________________________
copyright 1996 International Society for Analytical Cytology
