The publishing ecosystem is steadily moving from the physical environment of libraries to digital space. As it does, precise online discovery of content becomes vital for information-centric organizations. Further, with research output growing rapidly and crossing the boundaries of fixed domains to produce more interdisciplinary content, researchers need content discovery to be more intelligent.
Semantic metadata enables content intelligence: domain-specific Entities and Concepts are extracted from content and related to each other in a meaningful way, which makes it possible to identify related content and give intelligent answers to user queries.
Semantic enrichment services from Scope enhance content and data with contextual information by tagging, categorizing, and/or classifying items in relation to each other and to base reference sources. Our semantic enrichment helps users find more relevant information, gain deeper insight, and support decision-making.
Scope’s Semantic Enrichment architecture consists of the following components:
Domain Knowledge Manager:
The Domain Knowledge Manager supports the creation and maintenance of domain-specific controlled vocabularies (CVs) such as taxonomies, thesauri and ontologies. Based on the client’s requirements, Scope can process existing CVs or build new CVs by extracting keywords from source documents and classifying them into hierarchical structures. CVs can also be built by adopting an open-source CV and extending it with keywords extracted from the source documents.
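The idea of extending an adopted CV with keywords from source documents can be sketched as follows. This is a minimal illustration and not Scope's implementation: the frequency-based keyword heuristic, the dictionary-based taxonomy, and the names `extract_keywords`, `new_candidates` and `add_term` are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real pipeline would use a fuller one.
STOPWORDS = {"the", "and", "for", "with", "from", "that", "this", "are"}

def extract_keywords(text, min_count=2):
    """Return words that occur at least min_count times (a crude frequency heuristic)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if len(t) > 2 and t not in STOPWORDS)
    return {term for term, n in counts.items() if n >= min_count}

def new_candidates(taxonomy, keywords):
    """Return the keywords that do not yet appear anywhere in the taxonomy, sorted."""
    known = set(taxonomy)
    for children in taxonomy.values():
        known.update(children)
    return sorted(keywords - known)

def add_term(taxonomy, parent, term):
    """Place term under parent, creating either node if it is missing."""
    children = taxonomy.setdefault(parent, [])
    if term not in children:
        children.append(term)
    taxonomy.setdefault(term, [])

# Hypothetical seed CV (parent -> list of narrower terms) and source text.
seed_cv = {"disease": ["diabetes"], "diabetes": []}
source_text = ("Insulin regulates glucose. Insulin therapy lowers glucose "
               "in diabetes. Diabetes affects metabolism.")

candidates = new_candidates(seed_cv, extract_keywords(source_text))
# A curator (or a classifier) would then decide where each candidate belongs.
add_term(seed_cv, "hormone", "insulin")
```

In practice the placement step is where domain expertise matters most; the sketch only surfaces candidate terms and records the decision.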
Semantic Annotation Manager:
The Semantic Annotation Manager extracts and tags Named Entities and Concepts from source documents using NLP and statistical algorithms together with the CVs. The relationships between concepts are built using ontologies and NLP techniques. Triples (Subject/Predicate/Object) extracted from each document are stored in an RDF store, which is consulted during annotation to retrieve similar terms and relationships when further documents pass through the Semantic Annotation Manager.
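A simplified view of this annotate-then-store loop is sketched below. It assumes plain dictionary lookup in place of the NLP and statistical extraction described above, and an in-memory set in place of a real RDF store; `annotate`, `TripleStore` and the `ex:` identifiers are hypothetical names used only for illustration.

```python
import re

def annotate(text, vocabulary):
    """Return sorted (start, end, surface_form, concept_id) spans for CV labels found in text."""
    spans = []
    for label, concept in vocabulary.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), m.group(0), concept))
    return sorted(spans)

class TripleStore:
    """In-memory stand-in for an RDF store holding (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the given pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )

# Hypothetical CV labels mapped to concept identifiers.
vocabulary = {"insulin": "ex:Insulin", "glucose": "ex:Glucose"}
store = TripleStore()

# First document: tag concepts, then record a relationship found in it.
spans = annotate("Insulin lowers blood glucose.", vocabulary)
store.add("ex:Insulin", "ex:lowers", "ex:Glucose")

# A later document mentioning insulin can reuse what the store already knows.
known_relations = store.match(subject="ex:Insulin")
```

The wildcard `match` method mirrors, in a very reduced form, the triple-pattern lookup an RDF store offers through a query language such as SPARQL.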
Integration Manager:
The Integration Manager generates output in industry-standard formats such as RDF/XML, N3 (for RDF stores), SKOS and OWL. This also facilitates seamless integration with clients’ content management systems.
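To make the output step concrete, the sketch below serializes a few SKOS statements as N-Triples, a line-based subset of the Turtle/N3 family. It is an assumption-laden illustration: the `http://example.org/cv/` namespace and the `to_ntriples` helper are hypothetical, and the rule that any object starting with `http://` or `https://` is an IRI (everything else a plain literal) is a deliberate simplification. Only the SKOS namespace and its `prefLabel` and `broader` properties come from the SKOS standard.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/cv/"  # hypothetical namespace for the client's CV

def _literal(value):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'

def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines.

    Simplification: objects starting with http:// or https:// are written
    as IRIs, all other objects as plain literals.
    """
    lines = []
    for subject, predicate, obj in triples:
        if obj.startswith(("http://", "https://")):
            rendered = "<" + obj + ">"
        else:
            rendered = _literal(obj)
        lines.append("<" + subject + "> <" + predicate + "> " + rendered + " .")
    return "\n".join(lines)

# One concept with a preferred label and a broader concept.
triples = [
    (BASE + "insulin", SKOS + "prefLabel", "Insulin"),
    (BASE + "insulin", SKOS + "broader", BASE + "hormone"),
]
output = to_ntriples(triples)
```

A production pipeline would more likely rely on an established RDF library for serialization, which also covers RDF/XML and OWL output; the hand-written serializer here only shows the shape of the data.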