The publishing ecosystem is steadily moving from the physical environment of libraries to digital space, which makes precise online discovery of content vital for information-centric organizations. Further, as research output grows rapidly and crosses the boundaries of fixed domains to produce more interdisciplinary content, researchers need content discovery to be more intelligent.
Semantic metadata enables content intelligence by extracting domain-specific Entities and Concepts from content and relating them in a meaningful way to identify related content and facilitate intelligent answers to user queries.
Scope’s semantic enrichment services enhance content and data with contextual information by tagging, categorizing, and classifying items in relation to each other and to base reference sources. This enrichment helps users find more relevant information, gain deeper insight, and support decision-making.
Scope’s Semantic enrichment architecture consists of the following components:
Domain Knowledge Manager:
This component supports the creation and maintenance of domain-specific controlled vocabularies (CVs) such as taxonomies, thesauri and ontologies. Based on a client’s requirements, Scope can process existing CVs or RDF data, or build new CVs by extracting keywords from source documents and classifying them into hierarchical structures. A CV can also be built by adopting an open-source CV and extending it with keywords extracted from the source documents using NLP and statistical techniques.
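As an illustration of building a CV from source documents, the sketch below extracts frequent keywords and groups them under broader terms. It is a minimal example under stated assumptions, not Scope's implementation: the stop-word list, the seed hierarchy `SEED_BROADER`, and both function names are hypothetical, and simple frequency counting stands in for the NLP and statistical techniques mentioned above.

```python
import re
from collections import Counter

# Hypothetical stop-word list and seed hierarchy, for illustration only.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "to", "are", "with", "by"}
SEED_BROADER = {  # narrower term -> broader term
    "polymer": "material",
    "alloy": "material",
    "copolymer": "polymer",
}

def extract_keywords(text, min_count=2):
    """Return candidate terms that occur at least min_count times."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return {term for term, n in counts.items() if n >= min_count}

def build_vocabulary(keywords, broader):
    """Group keywords under their broader term; unknown terms await review."""
    tree = {}
    for term in sorted(keywords):
        parent = broader.get(term, "UNCLASSIFIED")
        tree.setdefault(parent, []).append(term)
    return tree

text = ("The copolymer is a polymer. The polymer blends with an alloy. "
        "The alloy and the copolymer are tested with graphene and graphene oxide.")
vocabulary = build_vocabulary(extract_keywords(text), SEED_BROADER)
print(vocabulary)
```

Terms that the seed hierarchy does not cover land under `UNCLASSIFIED`, which mirrors the step where newly extracted keywords still need to be placed in the hierarchy by a curator.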
Semantic Annotation Manager:
The Semantic Annotation Manager extracts and tags Named Entities and Concepts from source documents using NLP and statistical algorithms together with the CVs. The relationships between concepts are built using ontology terms and NLP techniques. Triples (Subject/Predicate/Object) extracted from each document are stored in an RDF store, which is consulted during annotation to retrieve similar terms and relationships when further documents pass through the Semantic Annotation Manager.
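The annotation and triple-extraction flow can be sketched as follows. This is a simplified illustration, not the actual component: the vocabulary entries, the `ex:` identifiers, the single relation cue and all function names are assumptions, a dictionary lookup stands in for NLP-based entity recognition, and a Python set stands in for the RDF store.

```python
import re

# Hypothetical controlled vocabulary: surface form -> (concept id, entity type).
VOCABULARY = {
    "aspirin": ("ex:Aspirin", "Drug"),
    "headache": ("ex:Headache", "Condition"),
    "ibuprofen": ("ex:Ibuprofen", "Drug"),
}
# Hypothetical relation cue: a verb mapped to a predicate.
RELATION_CUES = {"treats": "ex:treats"}

def annotate(sentence):
    """Return (start, end, concept id, type) for each vocabulary term found."""
    annotations = []
    for match in re.finditer(r"[A-Za-z]+", sentence):
        entry = VOCABULARY.get(match.group().lower())
        if entry:
            annotations.append((match.start(), match.end(), *entry))
    return annotations

def extract_triples(sentence):
    """Emit a triple when a relation cue sits between two adjacent annotated terms."""
    found = annotate(sentence)
    triples = set()
    for left, right in zip(found, found[1:]):
        between = sentence[left[1]:right[0]].lower()
        for cue, predicate in RELATION_CUES.items():
            if cue in between:
                triples.add((left[2], predicate, right[2]))
    return triples

def related_subjects(store, predicate, obj):
    """Look up subjects already linked to obj, as a later annotation pass might."""
    return sorted(s for s, p, o in store if p == predicate and o == obj)

store = set()  # stands in for an RDF store
store |= extract_triples("Aspirin treats headache in most adults.")
store |= extract_triples("Ibuprofen also treats headache.")
print(related_subjects(store, "ex:treats", "ex:Headache"))
```

Because the store accumulates triples across documents, the lookup for the second document already sees what the first one contributed.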
Integration Manager:
The Integration Manager generates output in industry-standard formats such as RDF/XML, N3 (for loading into an RDF store), SKOS and OWL. This also facilitates seamless integration with content management systems.
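To show what standards-based output of this kind can look like, the sketch below serializes a vocabulary concept as N-Triples lines using SKOS properties. N-Triples is a line-based RDF syntax that is also valid Turtle. The `BASE` namespace and the function name are hypothetical, and a production system would more likely rely on an RDF library than on string formatting.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/vocab/"  # hypothetical namespace

def concept_to_ntriples(term, label, broader=None):
    """Serialize one concept as N-Triples lines using SKOS properties."""
    subject = f"<{BASE}{term}>"
    # Escape backslashes and double quotes so the literal stays well-formed.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f'{subject} <{SKOS}prefLabel> "{safe_label}"@en .',
    ]
    if broader:
        lines.append(f"{subject} <{SKOS}broader> <{BASE}{broader}> .")
    return lines

for line in concept_to_ntriples("copolymer", "Copolymer", broader="polymer"):
    print(line)
```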
Scope’s Semantic Processing – Value Chain
Semantic Enrichment Offerings from Scope
Tagging XML documents with conceptual tags using domain-specific markup languages such as CML (Chemical Markup Language), MatML (Materials Markup Language) and GML (Geography Markup Language)
Annotating terms in abstracts with definitions using technical glossaries and dictionaries
Indexing documents with domain-specific concepts using controlled vocabularies such as taxonomies and thesauri to give terms their domain context
Identifying the semantic relationships across concepts and using them to link items within a publisher’s legacy content and to content in external open-access databases
Authoring abstracts that extract and explain the concepts described in a document, and delivering such abstracts in a structured format with pre-defined concept labels
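The indexing and linking offerings above can be illustrated with a small sketch: documents are indexed under preferred thesaurus terms, and documents that share a concept are treated as linked. The thesaurus entries, sample documents and function names are all hypothetical, and plain substring matching is used only to keep the example short.

```python
from collections import defaultdict

# Hypothetical thesaurus: variant term -> preferred concept label.
THESAURUS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}

def index_documents(documents):
    """Build concept -> set of document ids, mapping variants to preferred terms."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        lowered = text.lower()
        for variant, concept in THESAURUS.items():
            if variant in lowered:
                index[concept].add(doc_id)
    return index

def linked_documents(index, doc_id):
    """Return other documents that share at least one concept with doc_id."""
    linked = set()
    for docs in index.values():
        if doc_id in docs:
            linked |= docs - {doc_id}
    return sorted(linked)

documents = {
    "doc1": "Hypertension raises the risk of heart attack.",
    "doc2": "Outcomes after myocardial infarction in older patients.",
    "doc3": "Dietary salt and high blood pressure.",
    "doc4": "A survey of bridge construction methods.",
}
index = index_documents(documents)
print(linked_documents(index, "doc1"))
```

Here "heart attack" and "myocardial infarction" resolve to the same concept, so doc1 and doc2 are linked even though they share no wording.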