The UDCS File Format
The UDCS file format is a compact, general-purpose format that can
accommodate any kind of channel data. Used properly, it wastes very few
bytes while offering a wide choice of recording precision and fidelity.
Its limits on date range, sample rate, sample size and sample count are
very loose. See the UdcsChannel Class documentation for more about each
variable mentioned here.
File Structure:
The file starts with five null-terminated character strings: the group
name (or experiment name), the subject name (subject/animal/individual
name), the channel type (what was recorded), the unit name (units,
following the UDCS Units Specification) and the original format (if the
recording originates from a different recording program, this should
contain that program's name). We suggest you use something like this to
read a null-terminated string:
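// buff[1] stays 0, so buff is always a valid one-character C string.
char buff[] = {1, 0};
string ret = "";
// Read one byte at a time until the terminating null byte.
while ((fread(buff, 1, 1, fd) == 1) && buff[0] != 0)
    ret += buff;
// A nonzero byte here means fread failed before the terminator was found.
if (buff[0] != 0)
    throw UdcsException(string("Error while reading file."));
return ret;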
Next come the start and end times of the recording. Each is stored as
two 8-byte integers holding the year and the number of milliseconds into
that year. You can use the Date Class's read(FILE* fd) method to read
these two dates (see the Date Class documentation).
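As an illustration of the "milliseconds into the year" field, here is a
minimal sketch that splits such a value into a day index and a time of
day. The TimeOfYear struct and split_ms_of_year function are hypothetical
names, not part of the Date Class, and the sketch assumes that 0 means
00:00:00:000 on the first day of the year.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not part of the UDCS classes.
struct TimeOfYear {
    std::uint64_t day;     // zero-based day index within the year
    unsigned hour;         // 0..23
    unsigned minute;       // 0..59
    unsigned second;       // 0..59
    unsigned millisecond;  // 0..999
};

// Split a "milliseconds into the year" value into its components.
// ASSUMPTION: 0 corresponds to midnight at the start of the year.
constexpr TimeOfYear split_ms_of_year(std::uint64_t ms) {
    return TimeOfYear{
        ms / 86400000ULL,                             // ms per day
        static_cast<unsigned>(ms / 3600000ULL % 24),  // ms per hour
        static_cast<unsigned>(ms / 60000ULL % 60),    // ms per minute
        static_cast<unsigned>(ms / 1000ULL % 60),     // ms per second
        static_cast<unsigned>(ms % 1000ULL)
    };
}
```

For example, split_ms_of_year(93784005) yields day 1, 02:03:04:005.
Turning the day index into a month and day of month would additionally
require leap-year handling, which is left to the Date Class.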
After that come two 8-byte integers. The first is the sample rate
(number of samples per 24 hours) and the second is the sample count of
the data channel.
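Because the sample rate is expressed in samples per 24 hours rather than
in Hz, a conversion is usually needed. The following is a minimal sketch
of that arithmetic; the function names are hypothetical and not part of
the Udcs classes.

```cpp
#include <cstdint>

// Number of seconds in the 24-hour period used by the sample rate field.
constexpr double kSecondsPerDay = 86400.0;

// Convert a stored sample rate (samples per 24 hours) to Hz.
constexpr double samples_per_day_to_hz(std::uint64_t samples_per_day) {
    return static_cast<double>(samples_per_day) / kSecondsPerDay;
}

// Convert a whole-number rate in Hz to the stored samples-per-day value.
constexpr std::uint64_t hz_to_samples_per_day(std::uint64_t hz) {
    return hz * 86400ULL;
}

// Length of a recording in seconds, from its sample count and stored rate.
constexpr double duration_seconds(std::uint64_t sample_count,
                                  std::uint64_t samples_per_day) {
    return static_cast<double>(sample_count) * kSecondsPerDay
           / static_cast<double>(samples_per_day);
}
```

A 200 Hz channel is therefore stored with a sample rate of 17280000, and
a file holding 17280000 samples at that rate covers exactly one day.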
Then comes a single byte that should be read as an unsigned char. It
contains a bitmap that determines the data type of the channel (see the
UdcsChannel Class documentation).
Finally come the validity bitmap and the actual data values. The length
of the validity bitmap, in bytes, can be determined with the
size_of_valid_map method of the Udcs class, or by doing the following:
unsigned int ret = (c->sample_count%8)?1:0;
ret += c->sample_count/8;
return ret;
This returns the ceiling of sample_count/8, i.e. the number of bytes
needed to hold one bit per sample. The validity bitmap should be read as
an unsigned char array; each bit in this array corresponds by index to
one data value in the data value array. A set bit marks the data value
as valid and a cleared bit marks it as invalid.
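A lookup in the validity bitmap can be sketched as follows. The function
names are hypothetical, and the bit order within each byte is an
assumption (least significant bit first), since the text above does not
specify it; check the Udcs class before relying on it.

```cpp
#include <cstdint>

// Bytes needed for one validity bit per sample: the ceiling of count / 8.
constexpr std::uint64_t valid_map_bytes(std::uint64_t sample_count) {
    return sample_count / 8 + (sample_count % 8 ? 1 : 0);
}

// Test the validity bit of the sample at the given index.
// ASSUMPTION: sample i lives in byte i / 8, at bit i % 8 counted from the
// least significant bit. The real format may use the opposite bit order.
constexpr bool sample_is_valid(const unsigned char* valid_map,
                               std::uint64_t index) {
    return ((valid_map[index / 8] >> (index % 8)) & 1u) != 0;
}
```

Under that assumption, a bitmap whose first byte is 0x05 marks samples 0
and 2 as valid and samples 1 and 3 through 7 as invalid.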
Last, read the data value array, whose size is
sample_count * Udcs::size_of(data_type) and whose element type is the
one described by data_type.
Limitations:
UDCS does have some limitations, but they lie far beyond what even a
contemporary supercomputer can record or store.
Date: maximum range is 1/1/0000 00:00:00:000 to 31/12/2^64 23:59:59:999;
resolution is 1 millisecond.
Sample rate: maximum range is 1 sample/day to 2^64 samples/day, i.e.
1/(60*60*24) Hz to ~213.5 THz; resolution is 1 sample/day.
Storage size: maximum range is 0 to 2^64 data points (~2.92 billion
years of recording at 200 Hz, ~584 years at 1 GHz); resolution is 1 data
point.
Storage type: sample size ranges from 1 bit to 64 bits; unsigned and
floating-point types are supported.
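The sample rate and storage size figures above follow from simple
arithmetic on 2^64, which can be checked with a short sketch like the one
below. The names are hypothetical, and a year is assumed to be 365.25
days long.

```cpp
// 2^64, the upper bound of the sample rate and sample count fields.
constexpr double kTwoPow64 = 18446744073709551616.0;

constexpr double kSecPerDay = 86400.0;
// ASSUMPTION: an average year of 365.25 days.
constexpr double kSecPerYear = 365.25 * kSecPerDay;

// Highest representable sample rate in Hz: 2^64 samples spread over one day.
constexpr double max_sample_rate_hz() {
    return kTwoPow64 / kSecPerDay;
}

// Years needed to fill 2^64 data points when sampling at the given rate.
constexpr double max_recording_years(double hz) {
    return kTwoPow64 / hz / kSecPerYear;
}
```

max_sample_rate_hz() evaluates to roughly 2.135e14 Hz (about 213.5 THz),
and max_recording_years(1e9) to roughly 584.5 years. Other sampling rates
can be evaluated the same way.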