How Alignment Works
Alignment in Udicus refers to forcing the same start and end times
on all recorded channels. This is useful when viewing files, because
every channel then covers the same time span. Udicus offers four
alignment options: align to longest, align to shortest, align to
channel, and align to dates.
The ‘align to longest’ option finds the earliest start time
and the latest end time among all the channels that have been imported,
and aligns the channels to those dates. ‘Align to shortest’
finds the latest start time and the earliest end time. ‘Align
to longest’ can be thought of as taking the union of all the time spans,
while ‘align to shortest’ takes their intersection.
‘Align to channel’ can be used if a specific channel is
of higher significance and the user wants all the other channels to
conform to it.
Finally, there is ‘align to dates’, which takes the desired
start and end times from the user and aligns the channels to those
times. Internally, ‘align to dates’ is the basis for the other three
options. For example, ‘align to longest’ is actually a call
to ‘align to dates’ with the earliest start time and latest end time
as parameters. ‘Align to shortest’ works the same way. ‘Align
to channel’ takes the start and end times from the specified channel.
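The delegation described above can be sketched as follows. This is only an illustration, not the Udicus source: the channel tuples, the function names, and the placeholder body of align_to_dates are all assumptions, and the sketch leaves out the adjustment to the fastest channel as well as the case where the intersection is empty.

```python
from datetime import datetime

# Hypothetical channel record: (name, start, end). Not the Udicus data model.

def align_to_dates(channels, start, end):
    # Placeholder for the real work: report the span each channel is forced to.
    return {name: (start, end) for name, _, _ in channels}

def align_to_longest(channels):
    # Union of all spans: earliest start, latest end.
    start = min(c[1] for c in channels)
    end = max(c[2] for c in channels)
    return align_to_dates(channels, start, end)

def align_to_shortest(channels):
    # Intersection of all spans: latest start, earliest end.
    start = max(c[1] for c in channels)
    end = min(c[2] for c in channels)
    return align_to_dates(channels, start, end)

def align_to_channel(channels, name):
    # Take the span of the channel the user singled out.
    ref = next(c for c in channels if c[0] == name)
    return align_to_dates(channels, ref[1], ref[2])

channels = [
    ("ecg", datetime(2020, 1, 1, 13, 0, 0), datetime(2020, 1, 1, 14, 0, 0)),
    ("motion", datetime(2020, 1, 1, 13, 5, 0), datetime(2020, 1, 1, 14, 10, 0)),
]
longest = align_to_longest(channels)    # 13:00:00 to 14:10:00
shortest = align_to_shortest(channels)  # 13:05:00 to 14:00:00
```

Keeping a single routine that does the real work means the other three options only differ in how they pick the two dates.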
To make ‘align to longest’ and ‘align to shortest’
work with ‘align to dates’, we need functions to find the earliest and
latest start and end times. There are four functions that do these
jobs by walking through the list of channels and keeping track of the
earliest and latest start time, and the earliest and latest end time. Once
the desired dates are found, we adjust them so that they match
the channel with the highest sampling rate. This means that if, for
example, we find the latest start date to be 13:11:09.000, and the channel
with the highest sampling rate does not have a sample at that exact
time but has a sample at 13:11:09.250, then we take 13:11:09.250
as our latest start date. We do this because channels with high sampling
rates have little tolerance for errors in timing, so we want them to be
exactly on time. The slower channels, on the other hand, are not
affected as much by a small shift in time. A motion sensor that samples
every minute does not care if we shift its data by one or two seconds.
The find fastest function is called by the functions that find the
earliest/latest start/end times to find the channel with the highest
sampling rate.
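The adjustment to the fastest channel might look like the sketch below. The dictionary layout and the names find_fastest and snap_to_fastest are hypothetical, and moving to the first sample at or after the desired time is an assumption that matches the 13:11:09.250 example; the text does not say which direction is used for the other three dates.

```python
from datetime import datetime, timedelta

def find_fastest(channels):
    # Smallest sampling interval means highest sampling rate.
    return min(channels, key=lambda c: c["interval"])

def snap_to_fastest(t, channels):
    # Assumption: move t to the first sample of the fastest channel
    # that falls at or after t.
    fastest = find_fastest(channels)
    # timedelta // timedelta is exact integer floor division, so negating
    # both sides gives a ceiling without floating-point error.
    steps = -((fastest["start"] - t) // fastest["interval"])
    return fastest["start"] + steps * fastest["interval"]

channels = [
    # Samples at 13:11:08.250, 13:11:08.750, 13:11:09.250, ...
    {"name": "ecg",
     "start": datetime(2020, 1, 1, 13, 11, 8, 250000),
     "interval": timedelta(milliseconds=500)},
    # One sample per minute.
    {"name": "motion",
     "start": datetime(2020, 1, 1, 13, 11, 9),
     "interval": timedelta(minutes=1)},
]

latest_start = datetime(2020, 1, 1, 13, 11, 9)       # 13:11:09.000
adjusted = snap_to_fastest(latest_start, channels)   # 13:11:09.250
```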
Once we have determined the times we want to align to, we are ready
to start the first of the two steps of aligning the channels. This first
step consists of determining what needs to be done with the start and
end times of each channel. The four different cases we can have and
the actions we need to take for each are listed below.
Case | Action
Start time is before Desired time | Cut from beginning
Start time is after Desired time | Pad beginning
End time is before Desired time | Pad end
End time is after Desired time | Cut from end
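The four cases can be written as two independent comparisons, one per end of the channel. The function name plan_actions and the use of plain numbers for times are assumptions made for this sketch.

```python
def plan_actions(start, end, desired_start, desired_end):
    # Decide what to do at the front of the channel.
    if start < desired_start:
        front = "cut"    # channel begins too early
    elif start > desired_start:
        front = "pad"    # channel begins too late
    else:
        front = "none"
    # Decide what to do at the back of the channel.
    if end < desired_end:
        back = "pad"     # channel stops too early
    elif end > desired_end:
        back = "cut"     # channel runs too long
    else:
        back = "none"
    return front, back

# Times in seconds, purely illustrative.
early_channel = plan_actions(0, 100, 10, 120)   # cut front, pad back
late_channel = plan_actions(20, 130, 10, 120)   # pad front, cut back
```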
Cutting refers to removing samples from the channel, and padding is
adding samples. Added samples are marked invalid in the Udicus program
so that they don't get confused with real data. The key function of
step one is to determine whether to cut, pad, or do nothing at each
end, and then to calculate the number of samples that need to
be cut or padded. Whether to cut or pad is determined by comparing the
start and end times of each channel with the desired times. To calculate
how many samples need to be removed or added, ‘align to dates’
calls the specific cut or pad functions to do the job. There are four
functions that do the cutting and padding:
align_temp_file_front_pad
align_temp_file_front_cut
align_temp_file_back_pad
align_temp_file_back_cut
The cut and pad functions work directly with the data in the temp files.
First they find the difference between the actual time and the desired
time. The difference divided by the sampling interval of that channel
gives the number of samples to be removed or added. Once the number
is determined, each function does the cutting or padding in its own
way, because each case requires different treatment. Working with the
beginning of a channel involves rewriting the temp file, because it
is not possible to remove or add samples at the front of a file in
place. The back end of a channel is easier to work with, because we
only need to extend the file to add samples, and overwrite with zeros
the samples we want to remove. After the temp files have been fixed,
the valid map for each channel must be updated. This is done by copying
the old valid map and adding invalid markers where padding occurred, or
removing markers where samples were cut. Finally, the number of samples
is adjusted by subtracting the number of samples cut and adding the
number of samples padded.
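As a rough illustration of the bookkeeping, the sketch below computes the number of samples from a time difference and rebuilds a valid map. It does not touch any temp file. The names samples_between and update_valid_map, the list-of-booleans valid map, and the integer millisecond times are assumptions, as is the requirement that the difference be a whole multiple of the sampling interval.

```python
def samples_between(actual_ms, desired_ms, interval_ms):
    # Difference between actual and desired time, divided by the
    # sampling interval. Assumes the difference is a whole multiple.
    return abs(desired_ms - actual_ms) // interval_ms

def update_valid_map(valid, front_cut=0, front_pad=0, back_cut=0, back_pad=0):
    # Copy the old map, drop markers for samples that were cut,
    # and add invalid (False) markers where padding occurred.
    kept = valid[front_cut:len(valid) - back_cut]
    return [False] * front_pad + kept + [False] * back_pad

# A channel with 10 real samples, one per second, first sample at 0 ms
# and last sample at 9000 ms.
valid = [True] * 10
front_cut = samples_between(0, 2000, 1000)      # start is too early: cut 2
back_pad = samples_between(9000, 12000, 1000)   # end is too early: pad 3

new_valid = update_valid_map(valid, front_cut=front_cut, back_pad=back_pad)
new_count = len(valid) - front_cut + back_pad
```

The final sample count and the length of the new valid map should always agree, which makes a cheap consistency check after alignment.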