Add other attack mechanisms #2

Open · 100 opened this issue Aug 9, 2018 · 0 comments
Assignees: 100
Labels: enhancement (New feature or request)

@100 (Member) commented Aug 9, 2018

Right now we assume no feedback between the adversary and the classifier.

What if the adversary has access to the labels? What if the adversary has access to the raw probabilities? What if the adversary has access to some observation that can be linked back to the label or probability?

These questions are broad, and while some have been addressed in the machine learning literature, there are many possible takes on them as they specifically apply to text classification.
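As a concrete starting point, here is a minimal sketch of the probability-feedback setting. Everything in it is hypothetical, nothing in the repo defines these names yet: `classifier_fn` stands in for whatever interface the adversary gets, mapping a batch of texts to class probabilities. The loop greedily queries the classifier and keeps whichever single-word deletion most lowers the target-class probability.

```python
import numpy as np

def greedy_feedback_attack(text, classifier_fn, target_class, max_queries=200):
    """Greedy probability-feedback attack sketch.

    Repeatedly query the classifier and keep whichever single-word
    deletion most lowers the probability assigned to ``target_class``.

    ``classifier_fn``: list[str] -> ndarray of shape (n_texts, n_classes);
    a stand-in for whatever feedback the adversary actually observes.
    """
    words = text.split()
    queries = 0
    while words and queries < max_queries:
        base = classifier_fn([" ".join(words)])[0][target_class]
        # Score every single-word deletion in one batched query.
        candidates = [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]
        probs = classifier_fn(candidates)[:, target_class]
        queries += 1 + len(candidates)
        best = int(np.argmin(probs))
        if probs[best] >= base:
            break  # no single deletion reduces the target probability further
        words = words[:best] + words[best + 1:]
    return " ".join(words)
```

The same loop also covers the label-only question: if `classifier_fn` returns one-hot "probabilities" instead of soft scores, the attack still runs, but the search signal is much weaker because most candidate deletions tie.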

Potential ideas (this list will grow):

  • Use LIME to identify words that are important to classification results and apply targeted attacks (see the sketch below)
  • Simulate a sequence of back-and-forths between classifier and adversary (the greedy loop above is a first pass at this)
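To make the LIME idea concrete, here is a hedged sketch. The `lime` package's `LimeTextExplainer` and its `explain_instance`/`as_list` calls are real; the attack side (deleting the top-weighted words) is our own hypothetical choice, and `classifier_fn` is the same assumed interface as above.

```python
from lime.lime_text import LimeTextExplainer

def important_words(text, classifier_fn, class_names, target_class, top_k=5):
    """Rank words that push the classifier toward ``target_class``.

    Fits LIME's local surrogate model around ``text`` and returns the
    top_k words whose weights support the target class.
    """
    explainer = LimeTextExplainer(class_names=class_names)
    exp = explainer.explain_instance(
        text, classifier_fn, labels=(target_class,), num_features=top_k
    )
    # as_list() yields (word, weight) pairs; positive weight supports target_class.
    return [w for w, weight in exp.as_list(label=target_class) if weight > 0]

def targeted_deletion_attack(text, classifier_fn, class_names, target_class):
    """Delete only LIME-identified words instead of searching every position."""
    to_remove = set(important_words(text, classifier_fn, class_names, target_class))
    # Note: LIME tokenizes with its own regex, so matching on whitespace
    # tokens is an approximation in this sketch.
    return " ".join(w for w in text.split() if w not in to_remove)
```

In practice we would probably substitute synonyms rather than delete words, to keep the perturbed text fluent; deletion just keeps the sketch short.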
100 added the enhancement label Aug 9, 2018
100 self-assigned this Aug 9, 2018