
Project: dictionary based search

aishdharan edited this page Dec 7, 2020 · 4 revisions

supervised search using dictionaries

project lead

Aishwarya

technical lead

Ayush

members

Anugrah, Vanisha, Vaishali

Requirements

sectioning

Scenarios

  • articles with explicit named sections (JATS, an XML language for scientific documents). Top priority: start with JATS articles from EPMC (PMR has a JATS2HTML converter). JATS has 250 tags and various options, and its names are not always the obvious ones, e.g. "contributor" rather than "author", and "floats-group" for figures.

  • articles with explicit unnamed sections (XML).

  • articles in HTML

  • convert JATS-XML to HTML for human readers - customisation, editing, etc.

  • articles without explicit sections (e.g. from PDF or early XML). Hard (may need ML)

Abstract

Introduction

Conclusions

Figure captions

Acknowledgments
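
The sections listed above can be pulled out of a JATS article with the Python standard library. The following is a minimal sketch, not the project's actual code: the sample document, the function names `text_of` and `extract_sections`, and the output keys are all hypothetical, and it assumes the article uses un-namespaced JATS elements (`abstract`, `body/sec/title`, `ack`, `fig/caption`) with top-level sections directly under `body`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified JATS article used only for illustration.
SAMPLE_JATS = """<article article-type="research-article">
  <front>
    <article-meta>
      <abstract><p>We study dictionary search.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>Background text.</p></sec>
    <sec><title>Conclusions</title><p>Closing text.</p></sec>
  </body>
  <back>
    <ack><p>Thanks to the team.</p></ack>
  </back>
  <floats-group>
    <fig id="f1"><caption><p>Workflow overview.</p></caption></fig>
  </floats-group>
</article>"""


def text_of(element):
    """Join all text inside an element, collapsing runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def extract_sections(xml_text):
    """Return a dict mapping section names to their text content."""
    root = ET.fromstring(xml_text)
    sections = {}

    abstract = root.find(".//abstract")
    if abstract is not None:
        sections["Abstract"] = text_of(abstract)

    # Named top-level body sections; the <title> child gives the name.
    # Sections sharing a title (or lacking one) overwrite each other here.
    for sec in root.findall("./body/sec"):
        title = sec.find("title")
        name = text_of(title) if title is not None else "(untitled)"
        sections[name] = " ".join(text_of(p) for p in sec.findall("p"))

    ack = root.find(".//ack")
    if ack is not None:
        sections["Acknowledgments"] = text_of(ack)

    # Figure captions may sit inline or inside <floats-group>.
    captions = [text_of(c) for c in root.findall(".//fig/caption")]
    if captions:
        sections["Figure captions"] = captions

    return sections


if __name__ == "__main__":
    for name, content in extract_sections(SAMPLE_JATS).items():
        print(name, "->", content)
```

Real articles may nest sections, repeat titles, or use variant headings such as "Conclusion" or "Acknowledgements", so the matching would likely need to be more forgiving than this sketch.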

Tasks

  • Identify the category of each article (we are currently focusing on articles in JATS XML). To do this, find sections (Abstract, Introduction, Materials and Methods, Results, Discussion, etc.) that are present in one type of article but absent from the other types.
  • Create a statistical graph showing which types of scientific articles (experiment-based, review, or editorial) people read most.
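
One possible way to approach the two tasks above is sketched below, under stated assumptions. The input data is invented for illustration, the function names are hypothetical, and the type labels follow the `article-type` attribute values commonly used in JATS (`research-article`, `review-article`, `editorial`). The counts here describe articles in a corpus; how often each type is actually read would need a separate data source that this page does not specify.

```python
from collections import Counter

# Hypothetical input: (article type, set of section titles found) per article.
ARTICLES = [
    ("research-article",
     {"Abstract", "Introduction", "Materials and Methods", "Results", "Discussion"}),
    ("research-article",
     {"Abstract", "Introduction", "Results", "Discussion"}),
    ("review-article", {"Abstract", "Introduction", "Conclusions"}),
    ("editorial", {"Introduction"}),
]


def sections_by_type(articles):
    """Union of the section titles seen for each article type."""
    seen = {}
    for article_type, titles in articles:
        seen.setdefault(article_type, set()).update(titles)
    return seen


def distinctive_sections(articles):
    """Sections that occur in one article type and in no other type."""
    seen = sections_by_type(articles)
    result = {}
    for article_type, titles in seen.items():
        others = set()
        for other_type, other_titles in seen.items():
            if other_type != article_type:
                others |= other_titles
        result[article_type] = titles - others
    return result


def type_counts(articles):
    """Number of articles per article type."""
    return Counter(article_type for article_type, _ in articles)


if __name__ == "__main__":
    for article_type, titles in distinctive_sections(ARTICLES).items():
        print(article_type, "->", sorted(titles))
    # Crude text bar chart; a plotting library could replace this.
    for article_type, n in type_counts(ARTICLES).most_common():
        print(f"{article_type:18} {'#' * n}")
```

Using the union of titles per type keeps the sketch simple; comparing how frequently each section appears within a type would be a more robust signal on real data, where a single unusual article could otherwise hide a distinctive section.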