Adding user features reduces model performance? #38

@williamcao-01

Hi everyone,

I am trying to use the ml-1m (MovieLens 1M) data to build a recommender system model for users. What is odd to me is that the model performs better without the user features. Did I do something wrong when adding the features, or is this normal?

Fitting the dataset
dataset = Dataset()
dataset.fit(
    users=(row['UserID'] for index, row in users_df.iterrows()),
    items=(row['MovieID'] for index, row in movie_df.iterrows()),
    user_features=set(user_features_flat),
)

Creating the interaction and feature matrix
(interactions, weights) = dataset.build_interactions(
    (row['UserID'], row['MovieID'], row['rating'])
    for index, row in ratings_df.iterrows()
)
user_feature_matrix = dataset.build_user_features(
    (row['UserID'], [row['Gender'], row['Occupation'], row['age_group']])
    for index, row in users.iterrows()
)
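
The construction of `user_features_flat` is not shown above. Here is a minimal sketch of one way it could be built, assuming each user row carries `Gender`, `Occupation`, and `age_group` columns. The `column:value` prefixing, the helper `feature_tokens`, and the `sample_rows` data are illustrative assumptions, not part of the original code. Prefixing keeps values from different columns distinct, for example an occupation code of 1 and an age-group code of 1.

```python
# Hypothetical sample rows standing in for the users dataframe (assumption).
sample_rows = [
    {"UserID": 1, "Gender": "F", "Occupation": 10, "age_group": 1},
    {"UserID": 2, "Gender": "M", "Occupation": 1, "age_group": 56},
]

FEATURE_COLUMNS = ("Gender", "Occupation", "age_group")


def feature_tokens(row):
    """Return one 'column:value' token per feature column of a user row."""
    return [f"{col}:{row[col]}" for col in FEATURE_COLUMNS]


# Every distinct token; this is the kind of iterable passed as
# user_features to Dataset.fit.
user_features_flat = sorted(
    {tok for row in sample_rows for tok in feature_tokens(row)}
)

# (user id, [feature names]) pairs, the shape build_user_features accepts.
user_feature_pairs = [(row["UserID"], feature_tokens(row)) for row in sample_rows]
```

The same `feature_tokens` helper is used for both the fit step and the feature-matrix step, so the two cannot drift apart in how a feature is named.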

Model with user features
model = LightFM(no_components=70, loss='warp')
model.fit(interactions, user_features=user_feature_matrix, item_features=None,
          sample_weight=None, epochs=70, num_threads=4)
p_k = evaluation.precision_at_k(model, test, k=10, user_features=user_feature_matrix,
                                item_features=None, preserve_rows=False,
                                num_threads=4, check_intersections=True).mean()
p_k  # 0.14658715

Model without user features
model_cf = LightFM(no_components=70, loss='warp')
model_cf.fit(interactions, user_features=None, item_features=None,
             sample_weight=None, epochs=70, num_threads=4)
p_k_cf = evaluation.precision_at_k(model_cf, test, k=10, user_features=None,
                                   item_features=None, preserve_rows=False,
                                   num_threads=4, check_intersections=True).mean()
p_k_cf  # 0.1638668
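
For reference when comparing the two scores: precision at k is the fraction of a user's top-k recommendations that appear among that user's test interactions, averaged over users. Below is a pure-Python sketch of that metric, assuming the per-user rankings are already computed. The function names and toy data are hypothetical and this is not LightFM's implementation. It resembles `preserve_rows=False` only in that users without test interactions are skipped.

```python
def precision_at_k(ranked_items, positives, k=10):
    """Fraction of the first k ranked items that are in the positives set."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in positives)
    return hits / k


def mean_precision_at_k(rankings, test_positives, k=10):
    """Average precision@k over users with at least one test positive."""
    scores = [
        precision_at_k(rankings[user], positives, k)
        for user, positives in test_positives.items()
        if positives
    ]
    return sum(scores) / len(scores)


# Toy data (assumption): item ids ranked best-first per user.
rankings = {"u1": [3, 1, 4, 2], "u2": [2, 4, 1, 3], "u3": [1, 2, 3, 4]}
test_positives = {"u1": {1, 4}, "u2": {3}, "u3": set()}

score = mean_precision_at_k(rankings, test_positives, k=2)
```

With `k=2`, user `u1` has one hit in its top two items, `u2` has none, and `u3` is skipped because it has no test interactions.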
