Skip to content

Moderation proof of concept implementation #9461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 33 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
3978d9e
Create migration for new moderation column
marcospri Apr 4, 2025
a3e3c16
Model changes for new annotation_moderation column
marcospri Apr 4, 2025
336d097
Basic write change for the new moderation_status column
marcospri Apr 4, 2025
d4cca1e
Migration to backfill status based on existing rows
marcospri Apr 4, 2025
8c2245d
Migration for Group.pre_moderated
marcospri Apr 4, 2025
0ce47b2
Model changes for Group.pre_moderated
marcospri Apr 4, 2025
c52d4d4
Basic status changes
marcospri Apr 4, 2025
660d458
API
marcospri Apr 7, 2025
9305a30
FRONT END
marcospri Apr 7, 2025
d157b9b
Formatting
marcospri Apr 7, 2025
d348dc6
FRONTEND
marcospri Apr 7, 2025
7a12fbb
Static
marcospri Apr 7, 2025
75f4119
MODel
marcospri Apr 7, 2025
6a3eafe
API
marcospri Apr 7, 2025
882a229
F MACHINE
marcospri Apr 7, 2025
f3c7e89
Status changes
marcospri Apr 7, 2025
28e5f0b
API
marcospri Apr 7, 2025
8d76500
Add some tests and fix all the code
marcospri Apr 7, 2025
5100cca
Add pre_moderation on the right model
marcospri Apr 7, 2025
fe9fbf2
Index moderation status
marcospri Apr 7, 2025
09c6076
Dont include PRIVATE on moderation status
marcospri Apr 7, 2025
f19a24a
Sort annotations in new endpoint
marcospri Apr 7, 2025
ee8ff56
Filter annotations by status
marcospri Apr 7, 2025
fc1312f
Don't expose text and tags of moderated annos
marcospri Apr 8, 2025
f4fa8dd
Use service method for hide/unhide
marcospri Apr 8, 2025
9fc40d7
Centralize moderation status changes
marcospri Apr 8, 2025
75a47a3
Keep an log of moderation changes
marcospri Apr 8, 2025
fc4f741
Migration for moderation log
marcospri Apr 8, 2025
922fded
Store moderation status changes
marcospri Apr 8, 2025
b4ba3b9
Filter annos after a status change
marcospri Apr 9, 2025
c106ca4
Clean up
marcospri Apr 9, 2025
a6671c5
Func test
marcospri Apr 9, 2025
2eb14c5
Add moderation link on header
marcospri Apr 9, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion h/db/mixins.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,16 @@
import sqlalchemy as sa


class Timestamps:
class CreatedMixin:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting this in two, we not always need updated but if we have created we most likely want also updated.

created = sa.Column(
sa.DateTime,
default=datetime.datetime.utcnow,
server_default=sa.func.now(),
nullable=False,
)


class Timestamps(CreatedMixin):
updated = sa.Column(
sa.DateTime,
server_default=sa.func.now(),
Expand Down
7 changes: 6 additions & 1 deletion h/events.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,12 @@
from typing import Literal

AnnotationAction = Literal["create", "update", "delete"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember why I did this but I reckon I was looking for possible values



class AnnotationEvent:
"""An event representing an action on an annotation."""

def __init__(self, request, annotation_id, action):
def __init__(self, request, annotation_id, action: AnnotationAction):
self.request = request
self.annotation_id = annotation_id
self.action = action
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
"""Backfill DENIED based on moderation."""

import sqlalchemy as sa
from alembic import op

revision = "96cde96b2fd7"
down_revision = "c8f748cbfb8f"


def upgrade() -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This migrates existing AnnotationModeration rows to the moderation_status column.

conn = op.get_bind()

result = conn.execute(
sa.text(
"""
UPDATE annotation
SET moderation_status = 'DENIED'
FROM annotation_moderation
WHERE annotation.id = annotation_moderation.annotation_id
AND annotation.moderation_status is null
"""
)
)
print("\tUpdated annotation rows as DENIED:", result.rowcount) # noqa: T201

result = conn.execute(
sa.text(
"""
UPDATE annotation_slim
SET moderation_status = 'DENIED'
FROM annotation_moderation
WHERE annotation_slim.pubid = annotation_moderation.annotation_id
AND annotation_slim.moderation_status is null
"""
)
)
print("\tUpdated annotations as DENIED:", result.rowcount) # noqa: T201
Copy link
Member Author

@marcospri marcospri Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Also backfill moderation_log based on AnnotationModeration.

This is not going to be possible as we don't have the user who hide them.

We could:

  • Make user nullable on moderation_log.
  • Just don't migrate these to moderation_log
  • Pick one user, the creator of the group? Could be confusing on the future.



def downgrade() -> None:
pass
47 changes: 47 additions & 0 deletions h/migrations/versions/bd226cc1c359_create_moderation_log_table.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
"""Create moderation log table."""

import sqlalchemy as sa
from alembic import op

from h.db import types

revision = "bd226cc1c359"
down_revision = "96cde96b2fd7"


def upgrade() -> None:
op.create_table(
"moderation_log",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("user_id", sa.Integer(), nullable=False),
sa.Column("annotation_id", types.URLSafeUUID(), nullable=False),
sa.Column("old_moderation_status", sa.String(), nullable=False),
sa.Column("new_moderation_status", sa.String(), nullable=False),
sa.Column(
"created", sa.DateTime(), server_default=sa.text("now()"), nullable=False
),
sa.ForeignKeyConstraint(
["annotation_id"],
["annotation.id"],
name=op.f("fk__moderation_log__annotation_id__annotation"),
ondelete="CASCADE",
),
sa.ForeignKeyConstraint(
["user_id"],
["user.id"],
name=op.f("fk__moderation_log__user_id__user"),
ondelete="CASCADE",
),
sa.PrimaryKeyConstraint("id", name=op.f("pk__moderation_log")),
)
op.create_index(
op.f("ix__moderation_log_annotation_id"),
"moderation_log",
["annotation_id"],
unique=False,
)


def downgrade() -> None:
op.drop_table("moderation_log")
# ### end Alembic commands ###
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
"""Create the moderation_status column."""

import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql

revision = "c8f748cbfb8f"
down_revision = "cf4eedee60f7"


def upgrade() -> None:
moderation_status_type = postgresql.ENUM(
"APPROVED",
"DENIED",
"SPAM",
"PENDING",
name="moderationstatus",
)
moderation_status_type.create(op.get_bind(), checkfirst=True)
op.add_column(
"annotation",
sa.Column(
"moderation_status",
moderation_status_type,
nullable=True,
),
)
op.add_column(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • AnnotationSlim has a boolean moderated column. I'm going to keep it like that but take into consideration the new Annotation.moderation_status to keep them in sync.

We should have a spike/discussion about AnnotationSlim. We did some work there but we haven't committed to move the codebase over that table.

My current thinking is that we should:

  • Move annotation_metadata to be related to annotation.id
  • Make a assessment of current slow queries, existing indexes in annotation and make the necessary adjustments.
  • Move existing queries of AnnotationSlim back to Annotation.

I don't think we should block the pre-moderation work over this but it something we could work in parallel or right afterwards.

"annotation_slim",
sa.Column(
"moderation_status",
moderation_status_type,
nullable=True,
),
)


def downgrade() -> None:
op.drop_column("annotation_slim", "moderation_status")
op.drop_column("annotation", "moderation_status")
15 changes: 15 additions & 0 deletions h/migrations/versions/cf4eedee60f7_add_group_pre_moderated.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
"""Add Group.pre_moderated."""

import sqlalchemy as sa
from alembic import op

revision = "cf4eedee60f7"
down_revision = "9d97a3e4921e"


def upgrade() -> None:
op.add_column("group", sa.Column("pre_moderated", sa.Boolean(), nullable=True))


def downgrade() -> None:
op.drop_column("group", "pre_moderated")
32 changes: 26 additions & 6 deletions h/models/annotation.py
Original file line number Diff line number Diff line change
@@ -1,20 +1,32 @@
import datetime
from enum import Enum
from uuid import UUID

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql as pg
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.ext.mutable import MutableDict, MutableList
from sqlalchemy.orm import Mapped, relationship

from h.db import Base, types
from h.models.group import Group
from h.util import markdown_render, uri
from h.util.user import split_user


class ModerationStatus(Enum):
APPROVED = "APPROVED"
PENDING = "PENDING"
DENIED = "DENIED"
SPAM = "SPAM"


class Annotation(Base):
"""Model class representing a single annotation."""

# Expose the ModerationStatus directly here
ModerationStatus = ModerationStatus

__tablename__ = "annotation"
__table_args__ = (
# Tags are stored in an array-type column, and indexed using a
Expand Down Expand Up @@ -68,7 +80,7 @@ class Annotation(Base):
index=True,
)

group = sa.orm.relationship(
group = relationship(
Group,
primaryjoin=(Group.pubid == groupid),
foreign_keys=[groupid],
Expand Down Expand Up @@ -138,11 +150,11 @@ class Annotation(Base):
uselist=True,
)

mentions = sa.orm.relationship("Mention", back_populates="annotation")
mentions = relationship("Mention", back_populates="annotation")

notifications = sa.orm.relationship(
"Notification", back_populates="source_annotation"
)
notifications = relationship("Notification", back_populates="source_annotation")

moderation_status: Mapped[ModerationStatus | None]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an enum in postgres.

I generally try to avoid postgres enum because alembic is not 100% aware of them.

Here I reckon we should use them, they are very efficient storage-wise and the manual migrations we might have to write on them are actually trivial, is just that alembic doesn't pick them up:

ALTER TYPE moderationstatus ADD VALUE 'PENDING';   

Note that this a nullable field. This is necessary to create the new column but also the code assumes that null is either a private annotation or an approved one.


@property
def uuid(self):
Expand Down Expand Up @@ -258,8 +270,16 @@ def authority(self):
@property
def is_hidden(self):
"""Check if this annotation id is hidden."""

# TODO, move to the new column after migration and backfill migration
return self.moderation is not None

@property
def moderated(self):
# This replaces is_hidden, adding a new property to give more visibility to the change in the PoC
return bool(
self.moderation_status
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First place of two we need to account for moderation_status potentially being null meaning approved.

Here is not a problem because both private and approved mean not moderated anyway.

and self.moderation_status != ModerationStatus.APPROVED
)

def __repr__(self):
return f"<Annotation {self.id}>"
26 changes: 25 additions & 1 deletion h/models/annotation_moderation.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,31 @@
import sqlalchemy as sa
from sqlalchemy import ForeignKey
from sqlalchemy.orm import (
Mapped,
MappedAsDataclass,
mapped_column,
relationship,
)

from h.db import Base, types
from h.db.mixins import Timestamps
from h.db.mixins import CreatedMixin, Timestamps
from h.db.mixins_dataclasses import AutoincrementingIntegerID
from h.models.annotation import ModerationStatus


class ModerationLog(Base, AutoincrementingIntegerID, CreatedMixin, MappedAsDataclass):
__tablename__ = "moderation_log"

user_id: Mapped[int] = mapped_column(sa.ForeignKey("user.id", ondelete="CASCADE"))
user = relationship("User")

annotation_id: Mapped[types.URLSafeUUID] = mapped_column(
ForeignKey("annotation.id", ondelete="CASCADE"), index=True
)
annotation = relationship("Annotation")

old_moderation_status: Mapped[ModerationStatus | None] = mapped_column()
new_moderation_status: Mapped[ModerationStatus] = mapped_column()


class AnnotationModeration(Base, Timestamps):
Expand Down
13 changes: 8 additions & 5 deletions h/models/annotation_slim.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
import datetime

import sqlalchemy as sa
from sqlalchemy.orm import Mapped, relationship

from h.db import Base, types
from h.models import helpers
from h.models.annotation import Annotation


class AnnotationSlim(Base):
Expand Down Expand Up @@ -31,7 +33,7 @@ class AnnotationSlim(Base):
)
"""The value of annotation.id, named here pubid following the convention of group.pubid"""

annotation = sa.orm.relationship(
annotation = relationship(
"Annotation", backref=sa.orm.backref("slim", uselist=False)
)

Expand Down Expand Up @@ -64,6 +66,7 @@ class AnnotationSlim(Base):
default=False,
server_default=sa.sql.expression.false(),
)
moderation_status: Mapped[Annotation.ModerationStatus | None]

shared = sa.Column(
sa.Boolean,
Expand All @@ -78,26 +81,26 @@ class AnnotationSlim(Base):
nullable=False,
index=True,
)
document = sa.orm.relationship("Document")
document = relationship("Document")

user_id = sa.Column(
sa.Integer,
sa.ForeignKey("user.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
user = sa.orm.relationship("User")
user = relationship("User")

group_id = sa.Column(
sa.Integer,
sa.ForeignKey("group.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
group = sa.orm.relationship("Group")
group = relationship("Group")

# Using `meta` as `metadata` is reserved by SQLAlchemy
meta = sa.orm.relationship("AnnotationMetadata", uselist=False)
meta = relationship("AnnotationMetadata", uselist=False)

def __repr__(self):
return helpers.repr_(self, ["id"])
4 changes: 3 additions & 1 deletion h/models/group.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import slugify
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import relationship
from sqlalchemy.orm import Mapped, mapped_column, relationship

from h import pubid
from h.db import Base, mixins
Expand Down Expand Up @@ -157,6 +157,8 @@ class Group(Base, mixins.Timestamps):
sa.Enum(WriteableBy, name="group_writeable_by"), nullable=True
)

pre_moderated: Mapped[bool | None] = mapped_column()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the simplest change of this stream of work.

I haven't modified the API to take/expose this.


@property
def groupid(self):
if self.authority_provided_id is None:
Expand Down
2 changes: 1 addition & 1 deletion h/presenters/annotation_searchindex.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ def asdict(self):
def _add_hidden(self, result):
# Mark an annotation as hidden if it and all of it's children have been
# moderated and hidden.
parents_and_replies = [self.annotation.id] + self.annotation.thread_ids # noqa: RUF005
parents_and_replies = [self.annotation.id, *self.annotation.thread_ids]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO This method below is not reading from the new status column yet.

This has the effect that for example single annotation threads are not hidden from the response (but their content is hidden).

ann_mod_svc = self.request.find_service(name="annotation_moderation")
is_hidden = len(ann_mod_svc.all_hidden(parents_and_replies)) == len(
Expand Down
Loading
Loading