Interactions

Classes describing datasets of user-item interactions. Instances of these are returned by dataset-fetching and dataset-processing functions.

class spotlight.interactions.Interactions(user_ids, item_ids, ratings=None, timestamps=None, weights=None, num_users=None, num_items=None)

Interactions object. Contains (at a minimum) user-item interaction pairs, but can also be enriched with ratings, timestamps, and interaction weights.

For implicit feedback scenarios, user ids and item ids should only be provided for user-item pairs where an interaction was observed. All pairs that are not provided are treated as missing observations, and often interpreted as (implicit) negative signals.

For explicit feedback scenarios, user ids, item ids, and ratings should be provided for all user-item-rating triplets that were observed in the dataset.

Parameters

user_ids (array of np.int32) – array of user ids of the user-item pairs
item_ids (array of np.int32) – array of item ids of the user-item pairs
ratings (array of np.float32, optional) – array of ratings
timestamps (array of np.int32, optional) – array of timestamps
weights (array of np.float32, optional) – array of weights
num_users (int, optional) – Number of distinct users in the dataset. Must be larger than the maximum user id in user_ids.
num_items (int, optional) – Number of distinct items in the dataset. Must be larger than the maximum item id in item_ids.

Variables

user_ids (array of np.int32) – array of user ids of the user-item pairs
item_ids (array of np.int32) – array of item ids of the user-item pairs
ratings (array of np.float32, optional) – array of ratings
timestamps (array of np.int32, optional) – array of timestamps
weights (array of np.float32, optional) – array of weights
num_users (int, optional) – Number of distinct users in the dataset.
num_items (int, optional) – Number of distinct items in the dataset.
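As a rough illustration of the parameter descriptions above, the following sketch prepares explicit-feedback arrays with the documented dtypes. The toy data is hypothetical, and the ids are assumed to be zero-based, so that the smallest count larger than the maximum id is the maximum id plus one.

```python
import numpy as np

# Hypothetical toy data: three users, four items, five observed ratings.
user_ids = np.array([0, 0, 1, 2, 2], dtype=np.int32)
item_ids = np.array([0, 3, 1, 2, 3], dtype=np.int32)
ratings = np.array([4.0, 5.0, 3.0, 1.0, 2.0], dtype=np.float32)

# Each position describes one observed (user, item, rating) triplet,
# so the three arrays must have the same length.
assert len(user_ids) == len(item_ids) == len(ratings)

# Assumption: ids are zero-based. The counts must be larger than the
# maximum id, so max + 1 is the smallest value that satisfies this.
num_users = int(user_ids.max()) + 1
num_items = int(item_ids.max()) + 1
```

Following the signature above, these arrays would then be passed as `Interactions(user_ids, item_ids, ratings=ratings, num_users=num_users, num_items=num_items)`. For implicit feedback, the `ratings` array would simply be omitted.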

to_sequence(max_sequence_length=10, min_sequence_length=None, step_size=None)

Transform to sequence form.

User-item interaction pairs are sorted by their timestamps, and sequences of up to max_sequence_length events are arranged into a (zero-padded from the left) matrix with dimensions (num_sequences x max_sequence_length).

Valid subsequences of users’ interactions are returned. For example, if a user interacted with items [1, 2, 3, 4, 5], the returned interactions matrix at sequence length 5 and step size 1 will be given by:

[[1, 2, 3, 4, 5],
 [0, 1, 2, 3, 4],
 [0, 0, 1, 2, 3],
 [0, 0, 0, 1, 2],
 [0, 0, 0, 0, 1]]

At step size 2:

[[1, 2, 3, 4, 5],
 [0, 0, 1, 2, 3],
 [0, 0, 0, 0, 1]]

Parameters

max_sequence_length (int, optional) – Maximum sequence length. Subsequences shorter than this will be left-padded with zeros.
min_sequence_length (int, optional) – If set, only sequences with at least min_sequence_length non-padding elements will be returned.
step_size (int, optional) – The returned subsequences are the result of moving a sliding window over the input. This parameter governs the stride of that window. Increasing it will result in fewer subsequences being returned.

Returns

sequence interactions – The resulting sequence interactions.

Return type

SequenceInteractions
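The windowing described for to_sequence can be sketched in plain Python for a single user's items, already sorted by timestamp. This is not the library's implementation: `sliding_windows` is a hypothetical helper, it uses lists rather than arrays, and its default of `step_size=1` is an assumption chosen to match the first example matrix.

```python
def sliding_windows(items, max_sequence_length, step_size=1,
                    min_sequence_length=None):
    """Hypothetical helper: left-padded windows over one user's items."""
    rows = []
    # Walk the window's end position backwards from the full history.
    for end in range(len(items), 0, -step_size):
        window = items[max(end - max_sequence_length, 0):end]
        # Skip windows with too few non-padding elements, if requested.
        if min_sequence_length is not None and len(window) < min_sequence_length:
            continue
        # Zero-pad from the left up to max_sequence_length.
        padding = [0] * (max_sequence_length - len(window))
        rows.append(padding + window)
    return rows


history = [1, 2, 3, 4, 5]
step_one = sliding_windows(history, max_sequence_length=5)
step_two = sliding_windows(history, max_sequence_length=5, step_size=2)
filtered = sliding_windows(history, max_sequence_length=5,
                           min_sequence_length=3)
```

Here `step_one` and `step_two` reproduce the two example matrices given for step sizes 1 and 2, while `filtered` keeps only the rows with at least three non-padding elements.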

tocoo()

Transform to a scipy.sparse COO matrix.

tocsr()

Transform to a scipy.sparse CSR matrix.

class spotlight.interactions.SequenceInteractions(sequences, user_ids=None, num_items=None)

Interactions encoded as a sequence matrix.

Parameters

sequences (array of np.int32 of shape (num_sequences x max_sequence_length)) – The interactions sequence matrix, as produced by to_sequence()
num_items (int, optional) – The number of distinct items in the data

Variables

sequences (array of np.int32 of shape (num_sequences x max_sequence_length)) – The interactions sequence matrix, as produced by to_sequence()