Synthetic
Module containing functions for generating synthetic datasets with known properties, for model testing and experimentation.

spotlight.datasets.synthetic.generate_sequential(num_users=100, num_items=1000, num_interactions=10000, concentration_parameter=0.1, order=3, random_state=None)
Generate a dataset of user-item interactions where sequential information matters.
The interactions are generated by an nth-order Markov chain with a uniform stationary distribution, where transition probabilities are given by a doubly stochastic transition matrix. For nth-order chains, transition probabilities are a convex combination of the transition probabilities of the last n states in the chain.
The transition matrix is sampled from a Dirichlet distribution described by a constant concentration parameter. Concentration parameters closer to zero generate more predictable sequences.
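The generation scheme described above can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation. The function names (`sample_dirichlet_row`, `build_transition_matrix`, `generate_sequence`) are hypothetical, the alternating row/column normalization used to approach a doubly stochastic matrix is an assumption, and so is the choice of equal weights for the convex combination over the last n states.

```python
import random


def sample_dirichlet_row(num_items, concentration, rng):
    """Draw one row from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(num_items)]
    total = sum(draws)
    return [d / total for d in draws]


def build_transition_matrix(num_items, concentration, rng, iterations=200):
    """Sample Dirichlet rows, then alternate column and row normalization.

    Assumption: this Sinkhorn-style iteration is one way to push the matrix
    towards being doubly stochastic; the result is only approximate.
    """
    matrix = [sample_dirichlet_row(num_items, concentration, rng)
              for _ in range(num_items)]
    for _ in range(iterations):
        # Scale every column so that it sums to one.
        col_sums = [sum(row[j] for row in matrix) for j in range(num_items)]
        matrix = [[row[j] / col_sums[j] for j in range(num_items)]
                  for row in matrix]
        # Scale every row so that it sums to one (rows are exact on exit).
        normalized = []
        for row in matrix:
            row_sum = sum(row)
            normalized.append([value / row_sum for value in row])
        matrix = normalized
    return matrix


def generate_sequence(matrix, order, length, rng):
    """Generate states from an nth-order chain.

    Assumption: the convex combination uses equal weights, i.e. the rows of
    the last `order` states are averaged.
    """
    num_items = len(matrix)
    states = list(range(num_items))
    # Seed the chain with `order` uniformly chosen states.
    sequence = [rng.choice(states) for _ in range(order)]
    while len(sequence) < length:
        recent = sequence[-order:]
        probabilities = [sum(matrix[s][j] for s in recent) / order
                         for j in range(num_items)]
        sequence.append(rng.choices(states, weights=probabilities, k=1)[0])
    return sequence[:length]


rng = random.Random(0)
matrix = build_transition_matrix(num_items=5, concentration=0.1, rng=rng)
sequence = generate_sequence(matrix, order=3, length=20, rng=rng)
print(sequence)
```

With a small concentration value, each sampled row places most of its mass on a few states, which is why the resulting sequences tend to be easier to predict.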
Parameters:
 num_users (int, optional) – number of users in the dataset
 num_items (int, optional) – number of items (Markov states) in the dataset
 num_interactions (int, optional) – number of interactions to generate
 concentration_parameter (float, optional) – controls how predictable the sequence is; values closer to zero give more predictable sequences
 order (int, optional) – order of the Markov chain
 random_state (numpy.random.RandomState, optional) – random state used to generate the data
Returns: instance of the interactions class
Return type: Interactions