A Survey on Truth Discovery

Metadata

  • Authors: Yaliang Li, Jing Gao, Chuishi Meng, Qi Li, Lu Su, Bo Zhao, Wei Fan, Jiawei Han
  • Year: 2016
  • Source: ACM SIGKDD Explorations Newsletter, Vol. 17, No. 2, Pages 1-16
  • DOI/Link: 10.1145/2897350.2897352 (arXiv:1505.02463)
  • Status: to-read
  • Origin: Extracted from phishchain-2022 ([5] - truth discovery algorithms from the database community)
  • Tags: to-read reference truth-discovery crowd-sourcing data-quality em-algorithm glad survey

Notes

Publication added automatically from the bibliography.

Citation context in PhishChain 2022:

  • Referenced as [5] in the Truth Discovery Module section
  • Key observation: existing truth discovery algorithms (from the database community) perform poorly on the PhishChain problem
  • Reason: traditional algorithms (EM, GLAD) assume that a majority of verifiers respond to each task, which does not hold for URL verification

PhishChain findings:

  • PhishTank retrospective: only a handful of verifiers check each URL, despite thousands of verifiers in total
  • This sparse verification scenario violates the assumptions of the EM and GLAD algorithms
  • Motivated PhishChain’s PageRank-based truth discovery approach
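
To make the sparsity point concrete, the sketch below computes per-URL coverage, meaning the fraction of the verifier pool that voted on each URL. The vote table, verifier IDs, and the pool size of 1000 are hypothetical toy values, not PhishTank figures.

```python
# Toy data: url -> {verifier_id: label}; 1 = phishing, 0 = legitimate.
# All names and numbers are hypothetical, chosen only to illustrate sparsity.
votes = {
    "url_a": {"v1": 1, "v2": 1, "v3": 0},
    "url_b": {"v2": 0, "v4": 0},
    "url_c": {"v1": 1, "v5": 1, "v6": 1, "v7": 0},
}
TOTAL_VERIFIERS = 1000  # assumed size of the whole verifier pool


def coverage(votes, total_verifiers):
    """Fraction of the verifier pool that voted on each URL."""
    return {url: len(v) / total_verifiers for url, v in votes.items()}


cov = coverage(votes, TOTAL_VERIFIERS)
max_cov = max(cov.values())
print(cov)
print(f"max coverage: {max_cov:.1%}")
```

When even the best-covered URL is seen by well under half of the pool, any method that expects most verifiers to answer each task has very little evidence per task to work with.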

Baseline algorithms benchmarked:

  • EM (Expectation Maximization): 93.71% accuracy on PhishTank 2020
  • GLAD: 93.98% accuracy on PhishTank 2020
  • PhishChain PR-based: 95.45% accuracy (outperforms both)
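
The margins implied by these figures can be worked out directly. The short sketch below uses only the three accuracies recorded above; the variable names are illustrative.

```python
# Accuracies (%) as recorded in this note, PhishTank 2020.
accuracy = {"EM": 93.71, "GLAD": 93.98, "PhishChain PR-based": 95.45}

best = accuracy["PhishChain PR-based"]
margins = {}         # absolute gain in percentage points
rel_reductions = {}  # relative reduction in error rate

for name in ("EM", "GLAD"):
    err_base = 100 - accuracy[name]
    err_best = 100 - best
    margins[name] = best - accuracy[name]
    rel_reductions[name] = (err_base - err_best) / err_base
    print(f"{name}: +{margins[name]:.2f} pp, "
          f"{rel_reductions[name]:.1%} fewer errors")
```

The absolute gains are 1.74 and 1.47 percentage points over EM and GLAD respectively, which corresponds to roughly a quarter fewer misclassified URLs in each case.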

Survey coverage:

  • Truth discovery algorithms for conflicting crowd-sourced data
  • Inferring ground truth from assessments with varying expertise
  • Database community approaches to data quality
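
A common pattern in truth discovery is to estimate truths and source reliability jointly. The sketch below alternates weighted voting with re-weighting each source by how often it agrees with the current truths. It is a minimal illustration on toy data, not EM, GLAD, or any specific algorithm from the survey; the function name `discover_truth`, the smoothing constant `eps`, and the claims are all assumptions.

```python
import math
from collections import defaultdict


def discover_truth(claims, n_iter=10, eps=1e-2):
    """claims: list of (source, item, value). Returns (truths, weights)."""
    sources = {s for s, _, _ in claims}
    weights = {s: 1.0 for s in sources}  # start with equal trust
    truths = {}
    for _ in range(n_iter):
        # Step 1: weighted vote per item using the current source weights.
        scores = defaultdict(lambda: defaultdict(float))
        for s, item, value in claims:
            scores[item][value] += weights[s]
        truths = {item: max(vals, key=vals.get) for item, vals in scores.items()}
        # Step 2: re-weight each source by its smoothed error rate.
        errors = defaultdict(float)
        counts = defaultdict(int)
        for s, item, value in claims:
            counts[s] += 1
            errors[s] += value != truths[item]
        weights = {
            s: -math.log((errors[s] + eps) / (counts[s] + 2 * eps))
            for s in sources
        }
    return truths, weights


# Hypothetical toy claims: s1-s3 agree on items a-c, s4-s5 disagree;
# on item d only s1, s4 and s5 respond.
claims = [(s, item, "T") for item in "abc" for s in ("s1", "s2", "s3")]
claims += [(s, item, "F") for item in "abcd" for s in ("s4", "s5")]
claims.append(("s1", "d", "T"))

truths, weights = discover_truth(claims)
print(truths)
```

On this toy input, plain majority voting would pick F for item d (two votes against one), while the loop settles on T once s4 and s5 are down-weighted for disagreeing with the consensus on items a to c.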

Add the PDF to generate a full summary using /summarize-paper li-truth-discovery-survey-2016
