KeyError because of mismatch between outcome and positive label in data file #1

KasperFyhn · 2023-10-06T11:35:33Z

I get a KeyError in supervised_classification.py:77 which you seem to have anticipated.

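In the block below, the line **report[pos_label] raises a KeyError if the data file (like the one I got from you) contains 1s and 0s in the column for the outcome variable, e.g. political, instead of the actual positive label, which the code seems to assume. classification_report keys its output dict by the stringified class labels, so the report then has the keys "0" and "1" and no key named after the outcome.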

classifier.fit(X[train_index], y[train_index])
y_pred = classifier.predict(X[test_index])
report = classification_report(
    y[test_index], y_pred, output_dict=True
)
# The positive label is the same as the column name.
# But THIS MIGHT CHANGE so beware that then
# we have to relax the assumption in the next line.
pos_label = outcome
record = {
    "model": model,
    "outcome": outcome,
    "fold": i_fold,
    "accuracy": report["accuracy"],
    **report[pos_label],
}
records.append(record)

Looks like an easy fix.
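
One possible shape for the fix, as a minimal sketch rather than the project's actual patch: look up the positive class under the outcome name first and fall back to the stringified binary labels. The helper name positive_class_metrics and the fallback tuple are my own assumptions, and the report dict below is a hand-written, abridged stand-in for what classification_report(..., output_dict=True) returns, with made-up numbers.

```python
def positive_class_metrics(report, outcome, fallback_labels=("1", "1.0", "True")):
    """Return the per-class metrics for the positive class.

    Hypothetical helper: tries the outcome name first (labels stored as
    strings such as "political"), then common stringified forms of a
    binary positive label (1, 1.0, True).
    """
    for key in (outcome, *fallback_labels):
        if key in report:
            return report[key]
    raise KeyError(
        f"No positive-class entry for outcome {outcome!r}; "
        f"report keys are {sorted(report)}"
    )


# Illustrative report for a 0/1-coded outcome column (values are invented).
report = {
    "0": {"precision": 0.80, "recall": 0.90, "f1-score": 0.85, "support": 10},
    "1": {"precision": 0.75, "recall": 0.60, "f1-score": 0.67, "support": 5},
    "accuracy": 0.80,
}

record = {
    "outcome": "political",
    "accuracy": report["accuracy"],
    # Replaces **report[pos_label] from the original block.
    **positive_class_metrics(report, "political"),
}
```

In supervised_classification.py this would replace the pos_label = outcome assumption. Another option might be to pass target_names to classification_report so the keys are named explicitly, or to settle the label encoding in the input file format.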

KasperFyhn added the bug Something isn't working label Oct 6, 2023

KasperFyhn assigned x-tabdeveloping and miscodisco Oct 6, 2023

KasperFyhn mentioned this issue Oct 10, 2023

Add specification of input file format #3

Open
