Parsing SR Research EyeLink Data#
What you will learn in this tutorial:#
how to parse raw eye tracking files created with SR Research EyeLink
how to extract experiment information using patterns
how to create a custom dataset definition to load a complete dataset of multiple files
Preparations#
We import pymovements as the alias pm for convenience.
import pymovements as pm
Let’s start by downloading a toy dataset ToyDatasetEyeLink that contains *.asc files:
dataset = pm.Dataset("ToyDatasetEyeLink", path='data/ToyDatasetEyeLink')
dataset.download()
INFO:pymovements.dataset.dataset:
You are downloading the pymovements Toy Dataset EyeLink. Please be aware that pymovements does not
host or distribute any dataset resources and only provides a convenient interface to
download the public dataset resources that were published by their respective authors.
Please cite the referenced publication if you intend to use the dataset in your research.
Downloading https://github.com/pymovements/pymovements-toy-dataset-eyelink/archive/refs/heads/main.zip to data/ToyDatasetEyeLink/downloads/pymovements-toy-dataset-eyelink.zip
Checking integrity of pymovements-toy-dataset-eyelink.zip
Extracting pymovements-toy-dataset-eyelink.zip to data/ToyDatasetEyeLink/raw
Extracting archive: 0%| | 0/4 [00:00<?, ?file/s]
Extracting archive: 100%|██████████| 4/4 [00:00<00:00, 112.64file/s]
This dataset includes *.asc files that store raw eye-tracking data along with synchronization messages. Below, we’ll inspect the files included in the dataset:
asc_files = list(dataset.path.glob('**/*.asc'))
asc_files
[PosixPath('data/ToyDatasetEyeLink/raw/pymovements-toy-dataset-eyelink-main/raw/subject_2_session_1.asc'),
PosixPath('data/ToyDatasetEyeLink/raw/pymovements-toy-dataset-eyelink-main/raw/subject_1_session_1.asc')]
Let’s display the first 20 lines of one of the files to get a sense of its structure:
!head -n 20 data/ToyDatasetEyeLink/raw/pymovements-toy-dataset-eyelink-main/raw/subject_1_session_1.asc
We can see that this file is a converted version of an *.edf file created by EyeLink.
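The shell command above depends on a Unix `head` binary and a hard-coded path. As a portable alternative, here is a minimal pure-Python sketch that reads only the first lines of a file. The helper name `head` and the stand-in demo file are assumptions made for illustration; with the dataset downloaded you would pass one of the paths in `asc_files` instead.

```python
from itertools import islice
from pathlib import Path
import tempfile


def head(path, n=20):
    """Return the first n lines of a text file without reading the rest."""
    with open(path, encoding='utf-8', errors='replace') as file:
        return [line.rstrip('\n') for line in islice(file, n)]


# Demo on a small stand-in file (hypothetical content, not real EyeLink data).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / 'demo.asc'
    demo_path.write_text('\n'.join(f'line {i}' for i in range(100)), encoding='utf-8')
    first_lines = head(demo_path, n=3)

print(first_lines)
```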
Next, we load one of these files directly with pm.gaze.from_asc.
Loading eye-tracking data from a file#
Loading eye-tracking data is straightforward. You can load an .asc file with a single call to pm.gaze.from_asc:
gaze = pm.gaze.from_asc(file=asc_files[0])
gaze
This function automatically loads the raw eye-tracking data and attempts to infer the experimental settings used.
Let’s inspect a few rows of the samples stored in the resulting Gaze object:
gaze.samples
| time | pupil | pixel |
|---|---|---|
| i64 | f64 | list[f64] |
| 2762704 | 783.0 | [139.1, 142.8] |
| 2762705 | 783.0 | [139.3, 142.8] |
| 2762706 | 783.0 | [139.5, 142.4] |
| 2762707 | 783.0 | [139.6, 141.9] |
| 2762708 | 783.0 | [139.5, 141.3] |
| … | … | … |
| 2903401 | 705.0 | [762.7, 605.5] |
| 2903402 | 706.0 | [762.6, 605.2] |
| 2903403 | 706.0 | [762.5, 605.0] |
| 2903404 | 706.0 | [762.7, 604.9] |
| 2903405 | 705.0 | [763.0, 604.9] |
We can see that timestamps (column time), pupil diameter (column pupil), and raw pixel coordinates (column pixel) are extracted automatically.
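To make the mapping from file to columns concrete, here is a minimal sketch of how a single sample line could be split into these three columns. It assumes a monocular recording in which each sample line starts with timestamp, x, y and pupil size. The example line is rebuilt from the first table row, and `parse_sample` is a hypothetical helper, not the parser that pymovements uses.

```python
import re

# Assumed layout of a monocular sample line: time, x, y, pupil, then extra fields.
SAMPLE_RE = re.compile(
    r'^(?P<time>\d+)\s+'
    r'(?P<x>-?\d+(?:\.\d+)?)\s+'
    r'(?P<y>-?\d+(?:\.\d+)?)\s+'
    r'(?P<pupil>\d+(?:\.\d+)?)'
)


def parse_sample(line):
    """Return time, pupil and pixel for a sample line, or None for any other line."""
    match = SAMPLE_RE.match(line)
    if match is None:
        # Messages, events and samples with missing coordinates end up here.
        return None
    return {
        'time': int(match['time']),
        'pupil': float(match['pupil']),
        'pixel': [float(match['x']), float(match['y'])],
    }


# Illustrative line rebuilt from the first row of the table above.
example_line = '2762704\t  139.1\t  142.8\t  783.0\t...'
sample = parse_sample(example_line)
print(sample)
```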
Let’s now take a look at the experimental metadata that was retrieved:
gaze.experiment
Experiment
    EyeTracker
        left: True
        model: 'EyeLink Portable Duo'
        mounting: 'Desktop'
        right: False
        sampling_rate: 1000.0
        vendor: 'EyeLink'
        version: '6.12'
    sampling_rate: 1000.0
    Screen
        distance_cm: None
        height_cm: None
        height_px: 1024
        origin: None
        width_cm: None
        width_px: 1280
The metadata recorded in the file has been extracted, such as the eye tracker model and the screen resolution used during recording. Values that are not stored in the file remain None.
Loading eye-tracking data along with SR Research recording messages#
To extract all MSG-prefixed SR Research messages, pass messages=True to pm.gaze.from_asc. The messages are stored in gaze.messages:
gaze = pm.gaze.from_asc(file=asc_files[0], messages=True)
gaze.messages
| time | content |
|---|---|
| f64 | str |
| 2.695217e6 | "!CMD 0 select_parser_configura… |
| 2.695543e6 | "!CMD 0 fixation_update_interva… |
| 2.695544e6 | "!CMD 0 fixation_update_accumul… |
| 2.695546e6 | "!CMD 0 auto_calibration_messag… |
| 2.700119e6 | "DISPLAY_COORDS 0 0 1279 1023" |
| … | … |
| 2.904096e6 | "!V TRIAL_VAR forid " |
| 2.904097e6 | "!V TRIAL_VAR sessiontype " |
| 2.904098e6 | "!V TRIAL_VAR combinationid -32… |
| 2.904099e6 | "TRIAL_RESULT 0" |
| 2.904548e6 | "JUDO.STOP" |
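Messages like DISPLAY_COORDS carry experiment metadata. As a sketch of how a screen resolution could be derived from such a message, the snippet below treats the four numbers as the left, top, right and bottom pixel coordinates. This interpretation and the helper name `screen_resolution` are assumptions for illustration and not necessarily what pymovements does internally.

```python
import re


def screen_resolution(message):
    """Derive (width_px, height_px) from a DISPLAY_COORDS message, else None."""
    match = re.fullmatch(
        r'DISPLAY_COORDS\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)',
        message.strip(),
    )
    if match is None:
        return None
    left, top, right, bottom = map(int, match.groups())
    # Coordinates are inclusive, so 0..1279 spans 1280 pixels.
    return right - left + 1, bottom - top + 1


width_px, height_px = screen_resolution('DISPLAY_COORDS 0 0 1279 1023')
print(width_px, height_px)
```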
We can also control which messages are parsed by specifying them in the messages argument. For example, to extract only trial-related messages containing the keyword TRIAL, we can do the following:
gaze = pm.gaze.from_asc(file=asc_files[0], messages=['TRIAL'])
gaze.messages
| time | content |
|---|---|
| f64 | str |
| 2.762689e6 | "TRIALID 0" |
| 2.811758e6 | "!V TRIAL_VAR Session_Name_ sub… |
| 2.811759e6 | "!V TRIAL_VAR Trial_Index_ 1" |
| 2.81176e6 | "!V TRIAL_VAR RT_KAROLINSKA -1" |
| 2.811761e6 | "!V TRIAL_VAR RESPONSE_KAROLINS… |
| … | … |
| 2.904095e6 | "!V TRIAL_VAR q00corrans " |
| 2.904096e6 | "!V TRIAL_VAR forid " |
| 2.904097e6 | "!V TRIAL_VAR sessiontype " |
| 2.904098e6 | "!V TRIAL_VAR combinationid -32… |
| 2.904099e6 | "TRIAL_RESULT 0" |
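The filtering behaviour can be sketched in plain Python. The snippet assumes that each entry of the list is applied as a regular expression that may match anywhere in the message content, which is consistent with the "!V TRIAL_VAR" rows being kept above. `filter_messages` is a hypothetical helper, and the messages are a small hand-picked subset of the ones shown in the tables.

```python
import re


def filter_messages(messages, patterns):
    """Keep (time, content) pairs whose content matches at least one pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        (time, content)
        for time, content in messages
        if any(regex.search(content) for regex in compiled)
    ]


# Hand-picked subset of the messages shown above.
messages = [
    (2700119, 'DISPLAY_COORDS 0 0 1279 1023'),
    (2762689, 'TRIALID 0'),
    (2811759, '!V TRIAL_VAR Trial_Index_ 1'),
    (2904099, 'TRIAL_RESULT 0'),
    (2904548, 'JUDO.STOP'),
]

trial_messages = filter_messages(messages, ['TRIAL'])
print(trial_messages)
```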
Defining custom patterns for data extraction#
Now let’s define our own patterns to extract additional information from the *.asc files and add it as columns to the gaze samples.
We can do this with the patterns parameter of pm.gaze.from_asc.
patterns accepts either a list of custom patterns that fill additional columns or a key identifying a set of predefined, eye-tracker-specific patterns.
Let’s define a set of custom patterns to extract more information from the parsed messages and show the resulting samples:
patterns = [
    {
        'pattern': 'SYNCTIME_READING_SCREEN',
        'column': 'task',
        'value': 'reading',
    },
    {
        'pattern': 'SYNCTIME_JUDO',
        'column': 'task',
        'value': 'judo',
    },
    r'TRIALID (?P<trial_id>\d+)',
]
gaze = pm.gaze.from_asc(file=asc_files[0], patterns=patterns)
gaze.samples
| time | pupil | task | trial_id | pixel |
|---|---|---|---|---|
| i64 | f64 | str | str | list[f64] |
| 2762704 | 783.0 | null | "0" | [139.1, 142.8] |
| 2762705 | 783.0 | null | "0" | [139.3, 142.8] |
| 2762706 | 783.0 | null | "0" | [139.5, 142.4] |
| 2762707 | 783.0 | null | "0" | [139.6, 141.9] |
| 2762708 | 783.0 | null | "0" | [139.5, 141.3] |
| … | … | … | … | … |
| 2903401 | 705.0 | "judo" | "12" | [762.7, 605.5] |
| 2903402 | 706.0 | "judo" | "12" | [762.6, 605.2] |
| 2903403 | 706.0 | "judo" | "12" | [762.5, 605.0] |
| 2903404 | 706.0 | "judo" | "12" | [762.7, 604.9] |
| 2903405 | 705.0 | "judo" | "12" | [763.0, 604.9] |
The examples above illustrate that patterns can be defined in different forms. Some patterns simply match a message and assign a fixed column value (see the first pattern above), while others use regular expressions to capture dynamic information—for instance, the trial_id in the last pattern.
Given the patterns defined above, we can see that the columns task and trial_id have been added.
The trial_id was extracted from messages such as MSG 2762689 TRIALID 0, while the task value was obtained from messages like MSG 2814942 SYNCTIME_JUDO.
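The way a message affects all samples that follow it can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the pymovements implementation: `annotate` is a hypothetical helper, and the second sample line is made up. It also shows why the first rows have task set to null (no task message has been seen yet) and why trial_id comes out as a string.

```python
import re

patterns = [
    {'pattern': 'SYNCTIME_READING_SCREEN', 'column': 'task', 'value': 'reading'},
    {'pattern': 'SYNCTIME_JUDO', 'column': 'task', 'value': 'judo'},
    r'TRIALID (?P<trial_id>\d+)',
]


def annotate(lines, patterns):
    """Attach the most recently matched pattern values to every sample line."""
    state = {}
    compiled = []
    for pattern in patterns:
        if isinstance(pattern, dict):
            regex = re.compile(pattern['pattern'])
            compiled.append((regex, pattern['column'], pattern['value']))
            state[pattern['column']] = None
        else:
            regex = re.compile(pattern)
            compiled.append((regex, None, None))
            for name in regex.groupindex:
                state[name] = None

    samples = []
    for line in lines:
        if line.startswith('MSG'):
            for regex, column, value in compiled:
                match = regex.search(line)
                if match is None:
                    continue
                if column is not None:
                    state[column] = value  # fixed value pattern
                else:
                    state.update(match.groupdict())  # named groups, kept as str
        elif line[:1].isdigit():
            samples.append({'time': int(line.split()[0]), **state})
    return samples


lines = [
    'MSG 2762689 TRIALID 0',
    '2762704 139.1 142.8 783.0 ...',
    'MSG 2814942 SYNCTIME_JUDO',
    '2814943 500.0 400.0 700.0 ...',  # made-up sample for illustration
]
annotated = annotate(lines, patterns)
print(annotated)
```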
Writing a DatasetDefinition to parse the complete dataset#
Let’s create a custom DatasetDefinition to load all asc files, including the patterns we defined earlier.
First we create a ResourceDefinition that specifies how we want to load our asc files.
We can use the patterns that we identified and specify them as one of the load keyword arguments (load_kwargs).
In addition, we also define the filename pattern, which represents subject and session information encoded in the filename.
The datatypes of the additional metadata parsed from the filename can be specified via filename_pattern_schema_overrides.
resource_definition = pm.ResourceDefinition(
    content='gaze',
    filename_pattern=r'subject_{subject_id:d}_session_{session_id:d}.asc',
    filename_pattern_schema_overrides={
        'subject_id': int,
        'session_id': int,
    },
    load_kwargs={
        'patterns': patterns,
        'schema': {'trial_id': int},
    },
)
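To illustrate what the filename pattern and its schema overrides express, here is a minimal sketch that translates the curly-brace pattern into a regular expression and casts the captured fields. The helpers `compile_filename_pattern` and `parse_filename` are hypothetical and only handle the `:d` format specifier; pymovements may implement this differently.

```python
import re


def compile_filename_pattern(pattern):
    """Translate '{name:d}' placeholders into named regex groups."""
    # re.split with two groups yields: literal, name, spec, literal, name, spec, ...
    parts = re.split(r'\{(\w+)(?::(\w))?\}', pattern)
    regex = ''
    for i in range(0, len(parts), 3):
        regex += re.escape(parts[i])
        if i + 1 < len(parts):
            name, spec = parts[i + 1], parts[i + 2]
            body = r'\d+' if spec == 'd' else r'.+?'
            regex += '(?P<' + name + '>' + body + ')'
    return re.compile(regex)


def parse_filename(filename, pattern, schema):
    """Return the metadata encoded in filename, cast according to schema."""
    match = compile_filename_pattern(pattern).fullmatch(filename)
    if match is None:
        return None
    return {key: schema.get(key, str)(value) for key, value in match.groupdict().items()}


filename_pattern = r'subject_{subject_id:d}_session_{session_id:d}.asc'
schema = {'subject_id': int, 'session_id': int}

metadata = parse_filename('subject_2_session_1.asc', filename_pattern, schema)
print(metadata)
```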
Next, we need to define the experiment:
experiment = pm.Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38,
    screen_height_cm=30.2,
    distance_cm=68,
    origin='lower left',
    sampling_rate=1000,
)
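The physical screen size and viewing distance matter because they are needed to convert pixel coordinates into degrees of visual angle. The following sketch computes the largest visual angle reachable on each screen axis from the values above. The formula, which measures from the screen centre to the centre of the outermost pixel, is one common convention and an assumption here, and `max_dva` is a hypothetical helper.

```python
import math


def max_dva(size_px, size_cm, distance_cm):
    """Visual angle in degrees from the screen centre to the outermost pixel centre."""
    # Offset of the outermost pixel centre from the screen centre, in cm.
    half_extent_cm = (size_px - 1) / 2 * (size_cm / size_px)
    return math.degrees(math.atan2(half_extent_cm, distance_cm))


x_max_dva = max_dva(size_px=1280, size_cm=38, distance_cm=68)
y_max_dva = max_dva(size_px=1024, size_cm=30.2, distance_cm=68)
print(x_max_dva, y_max_dva)
```

A wider screen or a shorter viewing distance increases these angles, which is why the same pixel offset can correspond to different visual angles across experiments.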
We now use these to write our DatasetDefinition. We choose ToyDatasetEyeLink as the name.
dataset_definition = pm.DatasetDefinition(
    name='ToyDatasetEyeLink',
    experiment=experiment,
    resources=[resource_definition],
)
Let’s initialize a new Dataset and load the data using the dataset definition we just set up:
dataset = pm.Dataset(
    definition=dataset_definition,
    path='data/ToyDatasetEyeLink',
)
dataset.load()
Let’s inspect the first Gaze in this dataset:
dataset.gaze[0].samples
| time | pupil | task | trial_id | pixel |
|---|---|---|---|---|
| i64 | f64 | str | i64 | list[f64] |
| 2154556 | 778.0 | null | 0 | [138.1, 132.8] |
| 2154557 | 778.0 | null | 0 | [138.2, 132.7] |
| 2154558 | 778.0 | null | 0 | [138.2, 132.3] |
| 2154559 | 778.0 | null | 0 | [138.1, 131.9] |
| 2154560 | 777.0 | null | 0 | [137.9, 131.6] |
| … | … | … | … | … |
| 2339287 | 619.0 | "judo" | 12 | [637.7, 531.7] |
| 2339288 | 619.0 | "judo" | 12 | [637.9, 531.8] |
| 2339289 | 618.0 | "judo" | 12 | [637.8, 531.6] |
| 2339290 | 618.0 | "judo" | 12 | [637.6, 531.4] |
| 2339291 | 618.0 | "judo" | 12 | [637.3, 531.2] |
What you have learned in this tutorial:#
how to handle *.asc files
how to create a custom dataset loading all files and parsing custom messages
how to load the dataset into your working memory