Seeded Dataset
LeWiDi Moderation
Content moderation · imported 2026-05-02 · sha256:1c83…77ee
Records: 412,330
Actors: 1,842
Outcomes: Annotator consensus
Imported: 2026-05-02
Status: adapted
Schema Mapping
source → DecisionEvent

| source field | | DecisionEvent field |
|---|---|---|
| annotator_id | → | actor_id |
| label | → | decision |
| consensus_label | → | ground_truth |
| consensus_n | → | consensus_n |
| confidence | → | context_features.self_confidence |
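
The mapping above can be sketched in code. This is a minimal illustration, not the importer's actual implementation: the `DecisionEvent` dataclass and the `to_decision_event` helper are hypothetical, and only the field names come from the table. The sketch assumes that `item_id`, which the table does not map, is simply not carried over.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DecisionEvent:
    """Hypothetical target record; field names follow the mapping table."""
    actor_id: str
    decision: str
    ground_truth: str
    consensus_n: int
    context_features: Dict[str, Any] = field(default_factory=dict)


def to_decision_event(row: Dict[str, Any]) -> DecisionEvent:
    """Apply the source -> DecisionEvent mapping to one raw row."""
    return DecisionEvent(
        actor_id=row["annotator_id"],
        decision=row["label"],
        ground_truth=row["consensus_label"],
        consensus_n=int(row["consensus_n"]),
        # The annotator's own confidence is nested under context_features.
        context_features={"self_confidence": float(row["confidence"])},
    )


# First sample record from this page.
raw_row = {
    "item_id": "i_4421",
    "annotator_id": "r_127",
    "label": "unsafe",
    "confidence": 0.82,
    "consensus_label": "borderline",
    "consensus_n": 7,
}
event = to_decision_event(raw_row)
```

Keeping `self_confidence` inside `context_features` mirrors the dotted path in the table and leaves the top-level fields for the decision and its ground truth.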
Data Quality
Actor Coverage: 94.0%
Outcome Coverage: 71.0%
Temporal Coverage: 88.0%
Ground Truth Confidence: 72.0%
Calibration Readiness
78 / 100 · Ready w/ caveats
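Annotators have stable IDs and repeated assignments, so per-annotator calibration can be tracked across items. Ground truth is derived from annotator consensus rather than an external outcome, which puts a ceiling on how much confidence expected calibration error (ECE) estimates can carry.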
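
To make the ECE caveat concrete, here is a minimal sketch of a standard equal-width binned ECE, scored against the consensus label. The page does not say which ECE estimator is used, so the binning scheme, the `expected_calibration_error` function, and the choice of 10 bins are assumptions. The three inputs are the sample records shown on this page and are far too few for a meaningful estimate.

```python
from typing import List, Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width binned ECE: weighted mean of |confidence - accuracy| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one record")

    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, conf in enumerate(confidences):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append(i)

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(confidences[i] for i in members) / len(members)
        accuracy = sum(1 for i in members if correct[i]) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# (label, confidence, consensus_label) from the sample records on this page.
samples = [
    ("unsafe", 0.82, "borderline"),
    ("borderline", 0.61, "borderline"),
    ("safe", 0.70, "safe"),
]
confidences = [conf for _, conf, _ in samples]
# "Correct" here only means agreement with consensus, not with an external outcome.
correct = [label == consensus for label, _, consensus in samples]
ece = expected_calibration_error(confidences, correct)
```

Because `correct` measures agreement with consensus, any error in the consensus label flows straight into the estimate, which is the ceiling the readiness note describes.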
Blocking Issues
- Consensus-based truth only; no external outcome resolution
Sample Records
First 3 raw rows

| item_id | annotator_id | label | confidence | consensus_label | consensus_n |
|---|---|---|---|---|---|
| i_4421 | r_127 | unsafe | 0.82 | borderline | 7 |
| i_4421 | r_044 | borderline | 0.61 | borderline | 7 |
| i_4422 | r_127 | safe | 0.7 | safe | 5 |