Resources

CNP: A Corpus of annotated Claims found in NLP Papers

A corpus of 15M+ sentences extracted from 100k+ NLP papers (ACL Anthology, arXiv), enriched with:

As well as:

The main data files of the corpus are available on Hugging Face, and complementary code and materials are on GitHub. To learn more about the corpus, the analyses conducted on it, and the motivation behind its creation, see Analysing Claims in NLP Research: A NLP4NLP Approach (M2 thesis).