Semantic Annotation for ACM Categories

Given a research paper, we annotate it with the Wikipedia article that most closely matches its content.

Project Description

The aim of this project is to label a document using semantic analysis. We test our algorithm on the ACM dataset, with the help of the Wikipedia pages for its categories.

Methodology

The following is the process that we have followed.

Source Code

ACM

run.py

This script represents the dataset in the vector space model and saves the resulting models.
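
As a rough illustration of what a vector space representation involves, the sketch below builds TF-IDF vectors for a few tokenized documents, compares them with cosine similarity, and serializes the result. It is not the contents of run.py: the function names `build_tfidf` and `cosine`, the smoothed IDF formula, the toy documents, and the use of `pickle` for saving are all assumptions made for this example.

```python
import math
import pickle
from collections import Counter


def build_tfidf(docs):
    """Return (idf dict, list of sparse tf-idf vectors) for tokenized docs."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    # Smoothed idf (an assumed choice) so common terms keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({t: freq * idf[t] for t, freq in tf.items()})
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy documents, invented for illustration only.
docs = [
    ["graph", "algorithm", "shortest", "path"],
    ["database", "query", "index"],
    ["graph", "search", "algorithm"],
]
idf, vectors = build_tfidf(docs)

# "Saving the model" is sketched here as a pickle round trip in memory.
model_bytes = pickle.dumps({"idf": idf, "vectors": vectors})
restored = pickle.loads(model_bytes)
```

In this toy example the two graph-related documents share terms and so score a higher cosine similarity with each other than with the database document.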

pre_process.py

This script cleans the dataset.
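
The page does not list the cleaning steps, so the following is only a sketch of a typical text-cleaning pass: lowercasing, keeping alphabetic tokens, dropping stop words, and optional stemming (the analysis reports results with stemming). The stop-word list, the `clean` function, and the `crude_stem` suffix stripper are hypothetical; a real pipeline would more likely use an established stemmer such as Porter's.

```python
import re

# A tiny, assumed stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to",
              "is", "are", "with", "we"}


def crude_stem(token):
    """Strip one common suffix; a stand-in for a real stemmer, not Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean(text, stem=True):
    """Lowercase, keep alphabetic tokens, drop stop words, optionally stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem:
        tokens = [crude_stem(t) for t in tokens]
    return tokens


tokens = clean("We are testing the Algorithms!")
```

Passing `stem=False` keeps the surface forms, which makes it easy to compare results with and without stemming.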

classify.py

This script trains the classifier and reports the cross-validation results.
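
The classifier used by classify.py is not named on this page. To show the general shape of training plus k-fold cross-validation, the sketch below swaps in a simple centroid classifier that scores a document by its dot product with each class centroid. The functions `train_centroids`, `predict`, and `cross_validate`, the interleaved fold assignment, and the toy vectors and labels are all assumptions for illustration.

```python
from collections import defaultdict


def train_centroids(vectors, labels):
    """Average the sparse vectors of each class into one centroid per label."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        counts[label] += 1
        for term, weight in vec.items():
            sums[label][term] += weight
    return {
        label: {t: w / counts[label] for t, w in terms.items()}
        for label, terms in sums.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid has the largest dot product with vec."""
    def dot(u, v):
        return sum(w * v.get(t, 0.0) for t, w in u.items())
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(centroids), key=lambda label: dot(vec, centroids[label]))


def cross_validate(vectors, labels, k=3):
    """Return per-fold accuracy; fold i holds out items i, i + k, i + 2k, ..."""
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(vectors), k))
        train_x = [v for i, v in enumerate(vectors) if i not in test_idx]
        train_y = [y for i, y in enumerate(labels) if i not in test_idx]
        centroids = train_centroids(train_x, train_y)
        correct = sum(predict(centroids, vectors[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Toy data with two invented categories.
vectors = [
    {"graph": 1.0, "path": 1.0},
    {"query": 1.0, "index": 1.0},
    {"graph": 1.0, "tree": 1.0},
    {"query": 1.0, "table": 1.0},
    {"path": 1.0, "tree": 1.0},
    {"index": 1.0, "table": 1.0},
]
labels = ["theory", "databases", "theory", "databases", "theory", "databases"]
fold_accuracies = cross_validate(vectors, labels, k=3)
```

Each fold is evaluated only on documents the classifier did not see during training, so the per-fold accuracies estimate how well the model generalizes.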

Analysis

We present our analysis using pictorial representation.

MAP value (with Stemming): MAP Comparison

NDCG value (with Stemming): NDCG Comparison

MAP value (with Stemming): MAP Comparison
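
For readers unfamiliar with the two metrics in the figures, here is a minimal sketch of how MAP (mean average precision) and NDCG (normalized discounted cumulative gain) are commonly computed for ranked label predictions. The function names, the log2 discount, and the example rankings are assumptions; the project's own evaluation code may differ in detail.

```python
import math


def average_precision(ranked, relevant):
    """Mean of precision@rank taken at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """Average of average_precision over all queries (here, papers)."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


def ndcg(ranked, gains, k=None):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    k = len(ranked) if k is None else k

    def dcg(values):
        return sum(g / math.log2(pos + 1) for pos, g in enumerate(values, start=1))

    actual = dcg([gains.get(item, 0.0) for item in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical ranked category predictions for two papers.
rankings = [["A", "B", "C"], ["B", "A"]]
relevants = [{"A", "C"}, {"A"}]
map_score = mean_average_precision(rankings, relevants)
ndcg_score = ndcg(["B", "A"], {"A": 1.0})
```

Both metrics reward placing the correct categories near the top of the ranking; NDCG additionally supports graded relevance through the `gains` mapping.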

Tags

Link to resources

Presentation

Video

Report

Contributors