Changing your EDF implementation according to this list does not cause any incompatibility with EDF files or with software that abides by the official specs. Neither would you lose any of the original simplicity or flexibility. Some answers define EDF export more strictly than the official specs do. But EDF import (reader) software should accommodate all options that the official specs leave to the implementor. The list may give you an idea of these options.
EDF was designed in one day, and we originally had in mind the exchange of polygraphic recordings between mainly PCs in the old millennium. I suggest that you also abide by the three simple red-color guidelines (at Q3, Q7 and Q10), so your EDF can be used all over the world, on any machine and until the year 2084. If you want to use EDF also for the exchange of annotations, events and automatic or manual analysis results, then use EDF+ rather than EDF.
Q1. For text fields in the header, what is the character set to use?
Export. EDF specs say that header information should be coded in ASCII strings. The American Standard Code for Information Interchange (ASCII) is 7 bits wide and consists of control characters (byte values 0..31 and 127, for instance for LineFeed, FormFeed, Carriage Return, Delete) and printable characters (32..126). So, unless you are looking for trouble, use only printable ASCII characters (32..126).
Import. Should an EDF file ask for trouble (that is, contain control characters), EDF readers should not try to execute these. Should an EDF file contain control characters or otherwise illegal characters (127..255), warn the producer of that file.
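As a minimal sketch of such an import check, assuming the header field is available as raw bytes, a reader could list the offending bytes and build a display-safe string. The function names `check_header_field` and `safe_text` are hypothetical and not part of any EDF library.

```python
def check_header_field(raw: bytes) -> list[str]:
    """Return one warning per byte that is not printable ASCII (32..126)."""
    warnings = []
    for pos, byte in enumerate(raw):
        if byte < 32 or byte == 127:
            warnings.append(f"control character {byte} at offset {pos}")
        elif byte > 127:
            warnings.append(f"non-ASCII byte {byte} at offset {pos}")
    return warnings


def safe_text(raw: bytes) -> str:
    """Decode a field for display; illegal bytes become '?' and are never executed."""
    text = "".join(chr(b) if 32 <= b <= 126 else "?" for b in raw)
    return text.rstrip()  # drop the space padding at the end of the field


print(check_header_field(b"EEG Fpz-Cz      "))  # clean field: empty list
print(safe_text(b"EEG\x00\xb5V  "))
```

The warnings can be shown to the user, who can then pass them on to the producer of the file.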
Q2. Is the correct syntax for the date and time fields DD.MM.YY and hh.mm.ss (D, M, Y, h, m, and s = [0..9]) as in "02.08.51"? I also saw other variants, such as " 2. 8.51".
Export. The official specs say "The information in the ASCII strings must be left-justified and filled out with spaces" and "8 ascii : startdate of recording (DD.MM.YY)" and "8 ascii : starttime of recording (hh.mm.ss)". The format does not specify that D, M, Y, h, m and s = [0..9]. Therefore, some may argue that a space or even a blank (null character, 0) is also allowed in the ASCII string. However, using spaces conflicts with the "left-justification" spec and the null character is a 'forbidden' ASCII control character (see Q1). So, my advice is to produce EDF date and time fields containing only characters 0..9 and the period (.) as a separator, for example "02.08.51".
Import. Still, EDF viewers should also accommodate " 2. 8.51" and "2.8.51". And it is probably wise (and not much work) to have them also accommodate different separators, like in 02:08-51 and 02/08'51.
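A tolerant import routine along these lines might look as follows. This is only a sketch: the name `parse_edf_triplet` is hypothetical, and treating null characters as spaces is an assumption made here for robustness, not something the specs prescribe.

```python
import re

# Three groups of one or two digits, separated by any single non-digit,
# non-space character, with optional spaces around each group.
_TRIPLET = re.compile(
    r"^\s*(\d{1,2})\s*[^\d\s]\s*(\d{1,2})\s*[^\d\s]\s*(\d{1,2})\s*$"
)


def parse_edf_triplet(field: str) -> tuple[int, int, int]:
    """Parse an EDF date (DD.MM.YY) or time (hh.mm.ss) field into three ints."""
    cleaned = field.replace("\x00", " ")  # assumption: treat nulls as padding
    match = _TRIPLET.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognized date/time field: {field!r}")
    first, second, third = (int(group) for group in match.groups())
    return first, second, third


for example in ("02.08.51", " 2. 8.51", "2.8.51  ", "02:08-51", "02/08'51"):
    print(repr(example), parse_edf_triplet(example))
```

Range checks (day 1..31, hour 0..23 and so on) are left out to keep the sketch short.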
Q3. How about the Y2K millennium problem?
In fact, it is a centennial problem. An EDFdate of "02.08.51" in the "Startdate of Recording" field could specify a recording from 2051, 1951, 1851, 1751, etc. First, it is wise to put the full date in the "local recording identification" field (80 free ASCII characters), for instance in the format "Startdate 02-AUG-1951". This also avoids any confusion between American and European date formats.
Next, you can use 1985 as a clipping date. EDF was used for the first time in 1989, and at that time some older recordings from 1985 were also converted to EDF. No EDF was recorded before 1985, so you can use 85 as a clipping year in your EDF software: if the EDFyear (yy=51 in the above example) is equal to or larger than 85, the real start year is assumed to be yy + 1900; if it is smaller than 85, the real start year is assumed to be yy + 2000. In other words, in the EDF startdate, yy=00-84 means yyyy=2000-2084 and yy=85-99 means yyyy=1985-1999.
This clipping date was discussed and adopted by the Siesta project in 1999 and is also in our viewer PolyMan.
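The clipping rule can be sketched in a few lines. The function name `full_year` and its `clip` parameter are illustrative only.

```python
def full_year(yy: int, clip: int = 85) -> int:
    """Expand a two-digit EDF year using the 1985 clipping date."""
    if not 0 <= yy <= 99:
        raise ValueError(f"two-digit year expected, got {yy}")
    # yy = 85..99 -> 1985..1999, yy = 00..84 -> 2000..2084
    return 1900 + yy if yy >= clip else 2000 + yy


print(full_year(51))  # the "02.08.51" example is read as 2051
print(full_year(85))
```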
Q4. Are the "digital minimum" and "digital maximum" values hints or strict limits?
The specs say "The digital minimum and maximum of each signal should specify the extreme values that can occur in the data records." Note the word "can". It is not necessary that these values actually DO occur. So take safe values that you know the signal will not exceed, for instance the range of the ADC. Note that "The physical (usually also physiological) minimum and maximum of this signal should correspond to these digital extremes". This correspondence is necessary for assessing gain and offset of the signal.
Q5. Why not always use -32767 for "digital minimum" and +32767 for "digital maximum"?
Export. It is formally correct EDF as long as the purpose (specification of offset and amplification of the signal) is met with sufficient accuracy.
Q6. Which is the preferred method of encoding a channel, where gain = (physical maximum - physical minimum) / (digital maximum - digital minimum) is negative? Using physical minimum > physical maximum or using digital minimum > digital maximum?
Export. The specs say "The digital minimum and maximum of each signal should specify the extreme values that can occur in the data records. These often are the extreme output values of the A/D converter. The physical (usually also physiological) minimum and maximum of this signal should correspond to these digital extremes...". So, just reading this chronologically, first specify digital maximum > digital minimum, then derive the 'corresponding' physical minimum and physical maximum which in this case leads to physical minimum > physical maximum.
Import. Import routines should allow both alternatives because it is not much programming (just get gain and offset) and because someone else may have an interpretation different from mine.
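To illustrate why supporting both alternatives costs little, here is a sketch that derives gain and offset from the four header values, assuming the usual linear mapping physical = gain * digital + offset. The function name `calibration` is hypothetical.

```python
def calibration(dig_min: float, dig_max: float,
                phys_min: float, phys_max: float) -> tuple[float, float]:
    """Return (gain, offset) such that physical = gain * digital + offset."""
    if dig_max == dig_min:
        raise ValueError("digital minimum and maximum must differ")
    gain = (phys_max - phys_min) / (dig_max - dig_min)
    offset = phys_min - gain * dig_min
    return gain, offset


# Alternative 1: digital maximum > digital minimum, physical minimum > maximum.
print(calibration(-2048, 2047, 500.0, -500.0))
# Alternative 2: digital minimum > digital maximum, physical maximum > minimum.
print(calibration(2047, -2048, -500.0, 500.0))
```

Both calls describe the same inverted channel and yield the same negative gain and the same offset (up to floating-point rounding), so the reader needs no special case for either convention.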
Q7. Are "+22", ".5" and "1E3" valid syntaxes for number fields?
Yes, as long as the numbers are left-justified in the ASCII strings and filled out with spaces. "22" and "-1.23E-4" are also OK. In the latter example, better accuracy can be obtained by using a standardized dimension prefix: use "-123.456" with the dimension "uV " rather than "-1.23E-4" with the dimension "V ", because the same eight characters then carry more significant digits. In accordance with the examples in the original publication, and to avoid confusion between Continental European and (American) English conventions, never use a comma (",") as a digit grouping symbol or as a decimal separator. When a decimal separator is required, use a dot (".") only.
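A possible way to write and read such number fields is sketched below. The helper names are hypothetical, the default width of 8 matches the example strings above, and the choice to avoid exponent notation on export is a simplification of this sketch rather than a requirement of the specs.

```python
def format_number_field(value: float, width: int = 8) -> str:
    """Format value left-justified with a dot as decimal separator, no commas."""
    for decimals in range(width, -1, -1):
        text = f"{value:.{decimals}f}"  # 'f' formatting never emits a comma
        if len(text) <= width:
            break
    else:
        raise ValueError(f"{value} does not fit in {width} characters")
    if "." in text:
        text = text.rstrip("0").rstrip(".")  # drop needless trailing zeros
    return text.ljust(width)  # fill out with spaces


def parse_number_field(field: str) -> float:
    """Parse a number field; accepts forms such as '+22', '.5' and '1E3'."""
    text = field.strip()
    if "," in text:
        raise ValueError("comma is not allowed in EDF number fields")
    return float(text)


print(repr(format_number_field(-123.456)))
print(parse_number_field("+22     "), parse_number_field("1E3     "))
```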
Q8. How to specify signals that cannot be calibrated (like an oral-nasal thermocouple for respiration flow, or an event button)?
Export. Just set the physical dimension to some meaningless value like " ". Put appropriate values in the digital minimum/maximum fields and dummy values in the physical minimum/maximum fields. Do not make physical minimum = physical maximum, because that may result in 'division by zero' errors in programs that compute the signal gain from these values.
Import. Some EDF files may not contain valid numbers in the digital/physical minimum/maximum fields, especially when signals were not calibrated. It should still be possible to read these signals, albeit uncalibrated.
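One way a reader could stay tolerant is sketched here: if any of the four fields is not a valid number, or a range is degenerate, the signal is passed through as raw digital values and flagged as uncalibrated. The function name and the fallback of gain 1 and offset 0 are assumptions of this sketch.

```python
def read_calibration(dig_min: str, dig_max: str,
                     phys_min: str, phys_max: str) -> tuple[float, float, bool]:
    """Return (gain, offset, calibrated) from four raw header fields."""
    try:
        d0, d1 = float(dig_min), float(dig_max)
        p0, p1 = float(phys_min), float(phys_max)
        if d0 == d1 or p0 == p1:
            raise ValueError("degenerate range")
    except ValueError:
        # Assumed fallback: show raw digital values and mark as uncalibrated.
        return 1.0, 0.0, False
    gain = (p1 - p0) / (d1 - d0)
    return gain, p0 - gain * d0, True


print(read_calibration("-2048", "2047", "-500", "500"))
print(read_calibration("-2048", "2047", "        ", "        "))
```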
Q9. Do non-integer sampling frequencies (like 1/30 Hz) cause problems?
Not necessarily. Good viewers will count samples and compare these with "number of samples in a datarecord" and in this way count how many datarecords have been passed (and consequently how many "durations of a datarecord"). Because this is all integer computation, there are no round-off errors. This is why EDF recommends the "duration of a datarecord" to be an integer number of seconds. In the 1/30 Hz example, "duration of a datarecord" and "number of samples in a datarecord" can be 30 and 1, respectively, or 3600 and 120, respectively.
However, if a sampling frequency is 999.98 Hz (for instance due to a small inaccuracy of the ADC clock), 'integer EDF' would use datarecords of 50 s containing 49999 samples of each signal (999.98 Hz x 50 s = 49999 samples). Even if only one signal is in the file, there would be more than 61440 bytes in a datarecord (49999 samples x 2 bytes = 99998 bytes). The official specs say that in that case the duration should be a float value less than 1 s. This will inevitably cause a small round-off error in the timing. Item 10 of the programming guidelines explains that this error is negligible, even in extreme cases.
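The integer bookkeeping can be checked with exact rational arithmetic. The sketch below, with hypothetical helper names, finds the shortest integer datarecord duration for a given sampling frequency and the resulting number of bytes per datarecord.

```python
from fractions import Fraction

BYTES_PER_SAMPLE = 2
RECORD_LIMIT = 61440  # byte count quoted in the text above


def integer_record_layout(sampling_hz: str) -> tuple[int, int]:
    """Return (duration_s, samples_per_record), both as exact integers."""
    rate = Fraction(sampling_hz)  # exact, e.g. "1/30" or "999.98"
    # rate = samples / seconds in lowest terms, so the denominator is the
    # shortest whole-second duration that holds a whole number of samples.
    return rate.denominator, rate.numerator


def record_bytes(samples_per_record: int, n_signals: int = 1) -> int:
    """Size of one datarecord when every signal has the same sample count."""
    return samples_per_record * BYTES_PER_SAMPLE * n_signals


for hz in ("1/30", "999.98"):
    duration, samples = integer_record_layout(hz)
    size = record_bytes(samples)
    print(hz, duration, samples, size, size > RECORD_LIMIT)
```

Passing the frequency as a string keeps the arithmetic exact; a binary float such as 999.98 would first be rounded.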
Q10. Are the 2-byte samples in the data blocks written in big or little endian?
Indeed, the byte order for the integer datasamples is different in (among others) Intel and Motorola processors. In the first EDF application, described in the original article, the Intel little endian byte order was applied (see section Results) because we had mainly PCs in mind. That is, the lower-significance byte was stored before (at a lower address than) the higher-significance byte: the integer samples were stored "little-end-first". At present (March 1999) probably all EDF files in the world are in the little endian format and certainly all EDF viewers expect so. Let us keep it that way and ask the Motorola users to enforce little endian in their routines. Some Sun users already did so in Matlab. So, EDF samples should be stored in the little endian format (the default format in PC applications).
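In a language that lets you state the byte order explicitly, code becomes independent of the host processor. The following sketch uses Python's standard `struct` module, where the `<` prefix means little endian and `h` a signed 16-bit integer (EDF samples are 2-byte two's complement integers). The helper names are hypothetical.

```python
import struct


def decode_samples(block: bytes) -> list[int]:
    """Decode little-endian signed 16-bit samples from a data block."""
    count = len(block) // 2
    return list(struct.unpack(f"<{count}h", block[: count * 2]))


def encode_samples(samples: list[int]) -> bytes:
    """Encode integer samples as little-endian signed 16-bit values."""
    return struct.pack(f"<{len(samples)}h", *samples)


raw = encode_samples([1, -2, 256])
print(raw.hex())            # low byte of each sample comes first
print(decode_samples(raw))
```

Because the byte order is spelled out in the format string, the same code gives identical files on little endian and big endian machines.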
Q11. What are common errors in EDF files?
Q12. What are common errors in EDF viewers?
Q13. Do the mentioned EDF-supporting companies really provide correct EDF?
Not all companies provide perfect EDF. So, if you plan to buy EDF equipment, check its EDF files using the software at the downloads page. Or mail me a file and I will do a rough check (this offer is valid until further notice). Tell the supplier to correct any errors.
Q14. How to encode free-text annotations?
Use EDF+ instead of EDF.
Q15. How to encode events such as apneas, leg movements and stimuli?
Use EDF+ instead of EDF.
Q16. How to store analysis results in EDF?
Any automatic or manual analysis result that is again a single or multi-channel timeseries (for instance a deltaplot together with an automatically scored hypnogram) can easily be stored in an EDF file. Some experience and discussions in the COMAC-BME and Siesta groups resulted in the following guidelines:
Q17. Should the starttime of the recording be in local time or for instance in Greenwich Mean Time?
Everybody until now (2000) uses local time, so I suggest that you do the same.
Q18. Are there any standard texts for the EDF
We constructed some standard texts. EDF import (reader/browser, analysis) software should abide by the official specs and not depend on these standard texts. However, if the software detects that the imported file does contain standard texts, it can automatically recognize labels and dimensions. Using standard texts is not required for EDF compatibility. However, they reduce the probability of errors and avoid the need for user input in some types of automatic analysis programs. Therefore, it is wise to use the standard texts wherever possible.
Q19. Can EDF store hypnograms?
The use of EDF+ and standard sleep staging annotations is recommended, but plain EDF can store hypnograms perfectly well. Simply consider that a hypnogram is a single signal of 1 sample per 30s (or in some labs per 20s). For instance, all 1770 hypnograms made in the Siesta project are stored in an EDF file. The sleep stages W, 1, 2, 3, 4, R, MT and 'unscored' were coded in the EDF files as integer numbers 0, 1, 2, 3, 4, 5, 6 and 9, respectively. The EDF recording of an OSAS patient contains not only the polygraphic signals but also the hypnogram as one of the signals.
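The stage coding listed above translates directly into a lookup table. This is a sketch with hypothetical names; only the stage-to-number mapping itself comes from the text.

```python
# Stage codes as listed in the text above.
STAGE_CODES = {"W": 0, "1": 1, "2": 2, "3": 3, "4": 4,
               "R": 5, "MT": 6, "unscored": 9}
CODE_STAGES = {code: stage for stage, code in STAGE_CODES.items()}


def encode_hypnogram(stages: list[str]) -> list[int]:
    """Turn a list of scored epochs into integer samples, one per epoch."""
    return [STAGE_CODES[stage] for stage in stages]


def decode_hypnogram(samples: list[int]) -> list[str]:
    """Turn integer samples back into stage labels; unknown codes raise KeyError."""
    return [CODE_STAGES[code] for code in samples]


epochs = ["W", "1", "2", "R", "unscored"]
samples = encode_hypnogram(epochs)
print(samples)
print(decode_hypnogram(samples))
```

With 30 s epochs, each value becomes one sample of a signal whose datarecord could, for instance, last 30 s and hold 1 sample.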
Q20. How to encode other Neurophysiological investigations such as EMG or Evoked Potentials?
Use EDF+ instead of EDF.