See Main Page for details and downloads.
- Train on Train-Small and predict on Test.
- Train on Train-Small and predict on Nanni-Test.
- Train on Train-Small and predict on Nanni's 201.
- 5-fold cross-validation on Nanni's 201 (the original evaluation protocol of Nanni et al. (2018)).
- Train on Train-Remaining and predict on Test.
- Train on Train-Remaining and predict on Nanni-Test.
- Train on Train-Remaining and predict on Nanni's 201.
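For concreteness, the seven protocols as a hedged Python sketch; `load`, `train`, and `evaluate` are hypothetical stubs, since the release ships data files rather than a Python API:

```python
"""Sketch of the seven evaluation protocols. The loaders and the model
are hypothetical stubs; the dataset release does not ship a Python API."""
import numpy as np
from sklearn.model_selection import KFold

def load(split):           # stub: would read the released split files
    return np.arange(10)   # placeholder "queries"

def train(queries):        # stub: would fit the learning-to-rank baseline
    return {"trained_on": len(queries)}

def evaluate(model, queries):  # stub: would compute P@1, MAP, NDCG@20
    pass

# Protocols 1-3 and 5-7: train once, predict on each test collection.
for train_split in ("train-small", "train-remaining"):
    model = train(load(train_split))
    for test_split in ("test", "nanni-test", "nannis-201"):
        evaluate(model, load(test_split))

# Protocol 4: 5-fold cross-validation on Nanni's 201 queries,
# the original evaluation protocol of Nanni et al. (2018).
queries = load("nannis-201")
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(queries):
    evaluate(train(queries[tr]), queries[te])
```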
We report results separately for features derived from sentence and paragraph contexts.
Evaluation results for the evaluation protocols listed above. Significance is analyzed with a standard-error overlap test: ▾ significantly below, ▴ significantly above (by more than the standard error).
| | P@1 (paragraph) | MAP (paragraph) | NDCG@20 (paragraph) | P@1 (sentence) | MAP (sentence) | NDCG@20 (sentence) |
|---|---|---|---|---|---|---|
| **Small/Test** | | | | | | |
| Rank-lips | 0.582±0.007 | 0.746±0.004 | 0.810±0.003 | 0.623±0.007 | 0.771±0.004 | 0.828±0.003 |
| RankLib | 0.576±0.007 | 0.740±0.004 | 0.804±0.003 | 0.614±0.007 | 0.765±0.004 | 0.824±0.003 |
| **Small/Nanni-Test** | | | | | | |
| Rank-lips | 0.601±0.004▾ | 0.755±0.002▾ | 0.816±0.002▾ | 0.664±0.003▴ | 0.802±0.002▴ | 0.851±0.002▴ |
| RankLib | 0.594±0.004▴ | 0.751±0.002▴ | 0.813±0.002▴ | 0.668±0.003▾ | 0.806±0.002▾ | 0.855±0.002▾ |
| **Small/Nanni's 201** | | | | | | |
| Rank-lips | 0.617±0.034▴ | 0.762±0.022▴ | 0.821±0.017▴ | 0.657±0.033 | 0.784±0.022 | 0.836±0.017 |
| RankLib | 0.632±0.034▾ | 0.779±0.021▾ | 0.835±0.015▾ | 0.677±0.033 | 0.796±0.021 | 0.845±0.016 |
| **Nanni's 201-CV** | | | | | | |
| Rank-lips | 0.647±0.034 | 0.780±0.022 | 0.835±0.017 | 0.667±0.033 | 0.785±0.022 | 0.837±0.017 |
| RankLib | 0.602±0.034 | 0.747±0.022 | 0.817±0.017 | 0.612±0.034 | 0.765±0.022 | 0.824±0.016 |
| Nanni et al. | 0.637±0.034 | 0.777±0.021 | 0.833±0.016 | 0.667±0.034 | 0.790±0.022 | 0.842±0.016 |
| **Remaining/Test** | | | | | | |
| Rank-lips | 0.587±0.006 | 0.751±0.004 | 0.813±0.003 | 0.628±0.006 | 0.774±0.004 | 0.831±0.003 |
| **Remaining/Nanni-Test** | | | | | | |
| Rank-lips | 0.604±0.004 | 0.758±0.002 | 0.818±0.002 | 0.697±0.003 | 0.822±0.002 | 0.867±0.002 |
| **Remaining/Nanni's 201** | | | | | | |
| Rank-lips | 0.626±0.034 | 0.771±0.022 | 0.828±0.016 | 0.682±0.033 | 0.797±0.022 | 0.846±0.017 |
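The significance markers can be reproduced with a short overlap check. A minimal sketch, assuming the test compares two runs' per-query metric scores and flags a difference when their ±1 standard-error intervals do not overlap:

```python
import numpy as np

def se_overlap(scores_a, scores_b):
    """Standard-error overlap test on per-query metric scores.

    Returns '▴' if run A's ±1 SE interval lies entirely above run B's,
    '▾' if entirely below, and '' when the intervals overlap.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    se_a = a.std(ddof=1) / np.sqrt(len(a))
    se_b = b.std(ddof=1) / np.sqrt(len(b))
    if a.mean() - se_a > b.mean() + se_b:   # A's band entirely above B's
        return "▴"
    if a.mean() + se_a < b.mean() - se_b:   # A's band entirely below B's
        return "▾"
    return ""
```

Whether the released analysis uses exactly ±1 SE intervals is an assumption; the marker direction follows the caption's convention.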
If you obtain new results on this dataset, we want to hear about them and would be honored to include them in the table above.
The baseline uses listwise learning-to-rank to combine the following features.
All features are based on word/entity similarities between the context and (parts of) an aspect.
The following similarities are used; we exclude Nanni's RDF2Vec feature since it is difficult to produce and does not perform well. A minimal sketch of these similarity computations follows the list.
- **BM25:** Using the context as the query and the aspect part as the document, we use BM25 with default parameters as the ranking model.¹
- **TF-IDF:** Cosine TF-IDF score between the context and the aspect part. We use a TF-IDF variant with log tf normalization and smoothed inverse document frequency.
- **Overlap:** The number of unique words/entities shared between the context and the aspect part (no normalization).
- **W2Vec:** Word embedding similarity between the context and the aspect part, with word vectors weighted by their TF-IDF weight. The pretrained word embeddings are taken from word2vec-slim, a reduced version of the Google News word2vec model.²
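A minimal Python sketch of these four similarities, under stated assumptions: the exact smoothing and normalization of the released baseline are not specified here, so the variants below (log tf, smoothed idf, Okapi BM25 with k1=1.2, b=0.75) are illustrative choices, and `emb` stands for a hypothetical word2vec-slim lookup table:

```python
"""Sketch of the four similarity features described above. `df` maps a
term to its document frequency, `n_docs` and `avgdl` are corpus
statistics, and `emb` maps a token to an embedding vector (assumptions)."""
import math
from collections import Counter
import numpy as np

def tfidf_vec(tokens, df, n_docs):
    """TF-IDF with log tf normalization and smoothed idf (one common variant)."""
    tf = Counter(tokens)
    return {t: (1 + math.log(c)) * math.log(1 + n_docs / (1 + df.get(t, 0)))
            for t, c in tf.items()}

def cosine_sparse(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def tfidf_feature(context, aspect, df, n_docs):
    return cosine_sparse(tfidf_vec(context, df, n_docs),
                         tfidf_vec(aspect, df, n_docs))

def overlap_feature(context, aspect):
    """Unnormalized count of unique shared words (or entities)."""
    return len(set(context) & set(aspect))

def bm25_feature(context, aspect, df, n_docs, avgdl, k1=1.2, b=0.75):
    """Okapi BM25 with default parameters: context terms as the query,
    the aspect part as the document."""
    tf = Counter(aspect)
    score = 0.0
    for t in set(context) & set(tf):
        idf = math.log(1 + (n_docs - df.get(t, 0) + 0.5) / (df.get(t, 0) + 0.5))
        norm = tf[t] + k1 * (1 - b + b * len(aspect) / avgdl)
        score += idf * tf[t] * (k1 + 1) / norm
    return score

def w2vec_feature(context, aspect, emb, df, n_docs):
    """Cosine between TF-IDF-weighted embedding centroids."""
    def centroid(tokens):
        weights = tfidf_vec(tokens, df, n_docs)
        vecs = [w * emb[t] for t, w in weights.items() if t in emb]
        return np.sum(vecs, axis=0) if vecs else None
    u, v = centroid(context), centroid(aspect)
    if u is None or v is None:
        return 0.0
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```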
Feature combinations:
| context | aspect part | BM25 | TF-IDF | Overlap | W2Vec |
|---|---|---|---|---|---|
| sentence words | name words | X | X | X | |
| paragraph words | name words | X | X | X | |
| sentence words | content words | X | X | X | X |
| paragraph words | content words | X | X | X | X |
| sentence entities | content entities | X | X | X | |
| paragraph entities | content entities | X | X | X | |
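To make the combination step concrete: each (context, aspect) candidate receives one row of the feature values above, and a linear scorer with weights learned by the listwise learning-to-rank tool (e.g., coordinate ascent in Rank-lips or RankLib) orders the candidate aspects per context. A hedged sketch with made-up numbers:

```python
import numpy as np

# Hypothetical feature matrix for one context: one row per candidate
# aspect, columns as in the table above (BM25, TF-IDF, Overlap, W2Vec).
features = np.array([
    [12.3, 0.41, 5.0, 0.62],
    [ 8.7, 0.28, 2.0, 0.55],
    [15.1, 0.49, 7.0, 0.71],
])
weights = np.array([0.10, 1.50, 0.05, 2.00])  # learned by the LTR tool

scores = features @ weights        # linear combination of the features
ranking = np.argsort(-scores)      # candidate aspects, best first
```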
Entity-aspect-linking-2020 by Jordan Ramsdell and Laura Dietz is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on works at http://trec-car.cs.unh.edu/datareleases/v2.4-release.html, www.wikipedia.org, and https://federiconanni.com/entity-aspect-linking/.
1. We provide corpus statistics in our dataset.
2. Available at https://github.com/eyaler/word2vec-slim.