{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Parsing SR Research EyeLink Data" ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## What you will learn in this tutorial:\n", "\n", "* how to parse raw eye tracking files created with SR Research EyeLink\n", "* how to extract experiment information using patterns\n", "* how to create a custom dataset definition to load a complete dataset of multiple files" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "## Preparations\n", "\n", "We import `pymovements` as the alias `pm` for convenience." ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "import pymovements as pm" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "Let's start by downloading a toy dataset `ToyDatasetEyeLink` that contains `*.asc` files:" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "dataset = pm.Dataset(\"ToyDatasetEyeLink\", path='data/ToyDatasetEyeLink')\n", "dataset.download()" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "This dataset includes `*.asc` files that store raw eye-tracking data along with synchronization messages. 
Below, we’ll inspect the files included in the dataset:" ] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "asc_files = list(dataset.path.glob('**/*.asc'))\n", "asc_files" ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "Let’s display the first 20 lines of one of the files to get a sense of its structure:" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "!head -n 20 data/ToyDatasetEyeLink/raw/aeye-lab-pymovements-toy-dataset-eyelink-a970d09/raw/subject_1_session_1.asc" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "We can see that this file is a converted version of an `*.edf` file created by EyeLink.\n", "\n", "Let’s try loading one of these files directly using `pm.gaze.from_asc`:" ] }, { "cell_type": "markdown", "id": "11", "metadata": {}, "source": [ "### Loading eye-tracking data from a file\n", "Loading eye-tracking data is straightforward. 
You can load an `.asc` file with a single call to `pm.gaze.from_asc`:" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [], "source": [ "gaze = pm.gaze.from_asc(file=asc_files[0])\n", "gaze" ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "This function automatically loads the raw eye-tracking data and attempts to infer the experimental settings used.\n", "\n", "Let’s inspect a few rows from the resulting `GazeDataFrame`:" ] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "gaze.samples" ] }, { "cell_type": "markdown", "id": "15", "metadata": {}, "source": [ "We can see that timestamps (column `time`), pupil diameter (column `pupil`), and raw pixel coordinates (column `pixel`) are extracted automatically.\n", "\n", "Let’s now take a look at the experimental metadata that was retrieved:" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "gaze.experiment" ] }, { "cell_type": "markdown", "id": "17", "metadata": {}, "source": [ "The experimental metadata, such as the eye tracker model and the screen resolution used during recording, has been extracted from the file." ] }, { "cell_type": "markdown", "id": "18", "metadata": {}, "source": [ "### Loading eye-tracking data along with SR Research recording messages\n", "To extract all `MSG`-prefixed SR Research messages, pass `messages=True` to `pm.gaze.from_asc`. The messages are stored in `gaze.messages`:" ] }, { "cell_type": "code", "execution_count": null, "id": "19", "metadata": {}, "outputs": [], "source": [ "gaze = pm.gaze.from_asc(file=asc_files[0], messages=True)\n", "gaze.messages" ] }, { "cell_type": "markdown", "id": "20", "metadata": {}, "source": [ "We can also control which messages are parsed by specifying them in the `messages` argument. 
For example, to extract only trial-related messages containing the keyword `TRIAL`, we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "gaze = pm.gaze.from_asc(file=asc_files[0], messages=['TRIAL'])\n", "gaze.messages" ] }, { "cell_type": "markdown", "id": "22", "metadata": {}, "source": [ "### Defining custom patterns for data extraction\n", "\n", "Now let’s define our own patterns to extract additional information from the `*.asc` files and add them to the `GazeDataFrame`.\n", "We can do this via the `patterns` parameter of `pm.gaze.from_asc`.\n", "\n", "`patterns` accepts either a list of custom patterns to match additional columns or a key identifying predefined and eye-tracker-specific patterns.\n", "\n", "Let’s define a set of custom patterns to extract more information from parsed messages and show the resulting `GazeDataFrame`:" ] }, { "cell_type": "code", "execution_count": null, "id": "23", "metadata": {}, "outputs": [], "source": [ "patterns = [\n", " {\n", " 'pattern': 'SYNCTIME_READING_SCREEN',\n", " 'column': 'task',\n", " 'value': 'reading',\n", " },\n", " {\n", " 'pattern': 'SYNCTIME_JUDO',\n", " 'column': 'task',\n", " 'value': 'judo',\n", " },\n", " r'TRIALID (?P<trial_id>\\d+)',\n", "]\n", "\n", "gaze = pm.gaze.from_asc(file=asc_files[0], patterns=patterns)\n", "gaze.samples" ] }, { "cell_type": "markdown", "id": "24", "metadata": {}, "source": [ "The examples above illustrate that patterns can be defined in different forms. 
Some patterns simply match a message and assign a fixed column value (see the first pattern above), while others use regular expressions to capture dynamic information, for instance the `trial_id` in the last pattern.\n", "\n", "Given the patterns defined above, we can see that the columns `task` and `trial_id` have been added.\n", "\n", "The `trial_id` was extracted from messages such as `MSG 2762689 TRIALID 0`, while the `task` value was obtained from messages like `MSG 2814942 SYNCTIME_JUDO`." ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "### Writing a DatasetDefinition to parse the complete dataset\n", "Let’s create a custom `DatasetDefinition` to load all `*.asc` files, including the patterns we defined earlier.\n", "\n", "First, we create a `ResourceDefinition` that specifies how we want to load our `*.asc` files.\n", "We can use the `patterns` that we identified and specify them as one of the load keyword arguments (`load_kwargs`).\n", "\n", "In addition, we define the filename pattern, which represents subject and session information encoded in the filename.\n", "The datatypes of the additional metadata parsed from the filename can be specified via `filename_pattern_schema_overrides`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "26", "metadata": {}, "outputs": [], "source": [ "resource_definition = pm.ResourceDefinition(\n", " content='gaze',\n", " filename_pattern=r'subject_{subject_id:d}_session_{session_id:d}.asc',\n", " filename_pattern_schema_overrides={\n", " 'subject_id': int,\n", " 'session_id': int,\n", " },\n", " load_kwargs={\n", " 'patterns': patterns,\n", " 'schema': {'trial_id': int},\n", " },\n", ")" ] }, { "cell_type": "markdown", "id": "27", "metadata": {}, "source": [ "Next, we need to define the experiment:" ] }, { "cell_type": "code", "execution_count": null, "id": "28", "metadata": {}, "outputs": [], "source": [ "experiment = pm.Experiment(\n", " screen_width_px=1280,\n", " 
screen_height_px=1024,\n", " screen_width_cm=38,\n", " screen_height_cm=30.2,\n", " distance_cm=68,\n", " origin='lower left',\n", " sampling_rate=1000,\n", ")" ] }, { "cell_type": "markdown", "id": "29", "metadata": {}, "source": [ "We now use these to write our `DatasetDefinition`. We choose `ToyDatasetEyeLink` as the name." ] }, { "cell_type": "code", "execution_count": null, "id": "30", "metadata": {}, "outputs": [], "source": [ "dataset_definition = pm.DatasetDefinition(\n", " name='ToyDatasetEyeLink',\n", " experiment=experiment,\n", " resources=[resource_definition],\n", ")" ] }, { "cell_type": "markdown", "id": "31", "metadata": {}, "source": [ "Let’s initialize a new `Dataset` and load the data using the dataset definition we just set up:" ] }, { "cell_type": "code", "execution_count": null, "id": "32", "metadata": {}, "outputs": [], "source": [ "dataset = pm.Dataset(\n", " definition=dataset_definition,\n", " path='data/ToyDatasetEyeLink',\n", ")\n", "dataset.load()" ] }, { "cell_type": "markdown", "id": "33", "metadata": {}, "source": [ "Let’s inspect the first `Gaze` in this dataset:" ] }, { "cell_type": "code", "execution_count": null, "id": "34", "metadata": {}, "outputs": [], "source": [ "dataset.gaze[0].samples" ] }, { "cell_type": "markdown", "id": "35", "metadata": {}, "source": [ "## What you have learned in this tutorial:\n", "\n", "* how to handle `*.asc` files\n", "* how to create a custom dataset loading all files and parsing custom messages\n", "* how to load the dataset into your working memory" ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 5 }