r37980778c78--feed0e461649512c77b5927a26619c63

This folder contains R code for a rule-based Buddhist Sanskrit Segmenter and Lemmatiser, together with the data needed to use and evaluate the Segmenter and explanatory materials. The Segmenter has been tested on 639 sentences from 13 Buddhist texts (9 sūtras, 4 śāstras) and was evaluated as achieving 97% accuracy. The code and materials in this folder were developed as part of a Newton International Fellowship at King's College London, funded by the British Academy (NF161436).

Contents:
R code for segmentation, lemmatisation and evaluation (includes instructions to run the code)
PowerPoint presentation with background and explanation of the project
Wordlists and Wordlists documentation
n-gram and stem frequency tables necessary for segmentation
Gold-standard set of manually segmented and stemmed sentences for evaluation
Set of raw sentences for evaluation
Evaluation of the Krisha et al. seq2seq segmenter on Buddhist sentences, for reference purposes

This Segmenter has been used to prepare the Sanskrit Corpus at DOI 10.5281/zenodo.3457822.


Identity

Description: The Identity category includes attributes that support the identification of the resource.

PID: https://www.doi.org/10.5281/zenodo.3459219
URL: http://dx.doi.org/10.5281/zenodo.3459219
URL: https://figshare.com/articles/Buddhist_Sanskrit_Segmenter/11631978

Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Access Right: Open Access

Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Collected From: figshare
Hosted By: figshare
Publication Date: 2019-09-24

Additional Info

Language: UNKNOWN
Resource Type: Software
system:type: software

Management Info

Source: https://science-innovation-policy.openaire.eu/search/software?softwareId=r37980778c78::feed0e461649512c77b5927a26619c63
Author: jsonws_user
Last Updated: 17 December 2020, 22:05 (CET)
Created: 17 December 2020, 22:05 (CET)