CoLAGaze#

class pymovements.datasets.CoLAGaze(name: str = 'CoLAGaze', *, long_name: str = 'Corpus of Eye Movements for Linguistic Acceptability', mirrors: dict[str, Sequence[str]] = <factory>, resources: ResourceDefinitions = <factory>, experiment: Experiment = <factory>, extract: dict[str, bool] | None = None, custom_read_kwargs: dict[str, dict[str, Any]] | None = None, column_map: dict[str, str] | None = None, trial_columns: list[str] | None = None, time_column: str | None = None, time_unit: str | None = None, pixel_columns: list[str] | None = None, position_columns: list[str] | None = None, velocity_columns: list[str] | None = None, acceleration_columns: list[str] | None = None, distance_column: str | None = None, filename_format: dict[str, str] | None = None, filename_format_schema_overrides: dict[str, dict[str, type]] | None = None)[source]#

CoLAGaze dataset [Bondar et al., 2025].

This dataset includes eye-tracking data from native speakers of English reading sentences from the CoLA dataset. Eye movements are recorded at a sampling frequency of 2,000 Hz using an EyeLink 1000 eye tracker and are provided as pixel coordinates.

Check the respective paper for details [Bondar et al., 2025].

Warning

This dataset currently cannot be fully processed by pymovements due to an error during parsing of individual files.

See issue #1401 for reference.

name#

The name of the dataset.

Type: str

long_name#

The full name of the dataset.

Type: str

resources#

A list of dataset gaze_resources. Each list entry must be a dictionary with the following keys:

- resource: The URL suffix of the resource. This will be concatenated with the mirror.
- filename: The filename under which the file is saved.
- md5: The MD5 checksum of the respective file.

Type: ResourceDefinitions
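The md5 key allows a downloaded file to be checked against its expected checksum. The following is a minimal standard-library sketch of such a check; the helper names `md5_of_file` and `md5_matches` are hypothetical and are not part of pymovements.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=8192):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with Path(path).open('rb') as handle:
        # Read in chunks so that large archives need not fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_md5):
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected_md5.lower()
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the file again is the usual remedy.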

experiment#

The experiment definition.

Type: Experiment

filename_format#

Regular expression that will be matched before trying to load the file. Named groups will appear in the fileinfo dataframe.

Type: dict[str, str] | None
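To illustrate how named groups in a filename pattern turn into fields, here is a minimal sketch using only Python's `re` module. The pattern and filenames are hypothetical and are not the ones defined for CoLAGaze; the sketch shows the general mechanism rather than the pymovements implementation.

```python
import re

# Hypothetical pattern and filenames, chosen only for illustration.
pattern = re.compile(
    r'subject_(?P<subject_id>\d+)_session_(?P<session>\d+)\.csv',
)

match = pattern.fullmatch('subject_07_session_2.csv')
# Each named group yields one key, analogous to one fileinfo column.
fileinfo = match.groupdict() if match else None

# A filename that does not fit the pattern produces no match at all.
skipped = pattern.fullmatch('readme.txt')
```

Here `fileinfo` holds the two extracted values as strings, and `skipped` is `None`.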

filename_format_schema_overrides#

If named groups are present in the filename_format, this makes it possible to cast specific named groups to a particular datatype.

Type: dict[str, dict[str, type]] | None
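Values extracted from a filename by a regular expression are strings, which is why a cast can be useful. The sketch below shows the idea in plain Python. The group names, the overrides, and the helper `cast_groups` are hypothetical. Only the inner mapping from group name to type is shown; the outer dictionary level of the documented type is left out.

```python
# Hypothetical values as extracted from a filename; regex groups are strings.
groups = {'subject_id': '07', 'session': '2', 'condition': 'a'}

# Hypothetical overrides: group name -> target type.
overrides = {'subject_id': int, 'session': int}


def cast_groups(groups, overrides):
    """Cast each group with its override type; keep others as strings."""
    return {
        name: overrides.get(name, str)(value)
        for name, value in groups.items()
    }


typed = cast_groups(groups, overrides)
```

After the cast, `subject_id` and `session` are integers, so they sort and compare numerically, while `condition` stays a string.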

custom_read_kwargs#

If specified, these keyword arguments will be passed to the file reading function. (default: None)

Type: dict[str, dict[str, Any]] | None
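Passing keyword arguments through to a reading function can be sketched as follows. This is an illustration under stated assumptions: `read_rows` is a hypothetical stand-in built on the standard-library `csv` module, not the reader that pymovements uses, and the `'gaze'` key and its contents are made up for the example.

```python
import csv
import io


def read_rows(text, **read_kwargs):
    """Stand-in reading function that forwards kwargs to csv.reader."""
    return list(csv.reader(io.StringIO(text), **read_kwargs))


# Hypothetical dict of dicts: one set of keyword arguments per content type.
custom_read_kwargs = {'gaze': {'delimiter': '\t'}}

# The inner dict is unpacked into the reading function's keyword arguments.
rows = read_rows(
    'time\tx\ty\n0\t1.5\t2.5\n',
    **custom_read_kwargs['gaze'],
)
```

Keeping the arguments in a dictionary keeps file-format details such as the delimiter in the dataset definition rather than in the loading code.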

Examples

Initialize your Dataset object with the CoLAGaze definition:

>>> import pymovements as pm
>>>
>>> dataset = pm.Dataset("CoLAGaze", path='data/CoLAGaze')

Download the dataset resources:

>>> dataset.download()

Load the data into memory:

>>> dataset.load()

Methods

`__init__`([name, long_name, mirrors, ...])
`from_yaml`(path)	Load a dataset definition from a YAML file.
`to_dict`(*[, exclude_private, exclude_none])	Return dictionary representation.
`to_yaml`(path, *[, exclude_private, exclude_none])	Save a dataset definition to a YAML file.

Attributes

`acceleration_columns`
`column_map`
`custom_read_kwargs`
`distance_column`
`extract`
`filename_format`
`filename_format_schema_overrides`
`has_resources`	Checks for resources in `resources`.
`long_name`
`name`
`pixel_columns`
`position_columns`
`time_column`
`time_unit`
`trial_columns`
`velocity_columns`
`resources`
`experiment`
`mirrors`
