Most publishers hold author details only as metadata within individual documents and have no easily searchable database of authors and their profiles. As a result, many publishers have recognized the need for a comprehensive author data management solution: one that extracts this information from the documents, then standardizes and disambiguates the author names to create a database of unique authors and their profiles.
With over two decades of experience in information processing, Scope offers niche author database development, data standardization, data validation/updating, data normalization, data visualization and data cleanup services. Over the years, Scope has executed several key assignments for major global clients that involved creating author name and affiliation databases by extracting and parsing author data from source documents. In total, Scope has processed more than 7 million articles, yielding 30 million author records.
Scope’s Author Data Services employs a team of certified data quality analysts to evaluate the quality of the author records sourced from different publishers or institutions. The data analysts determine the extent of missing data and duplicate data, check the quality of parsing of the author name and affiliation data, and ensure the output is devoid of spelling errors and standardization issues with author names and affiliation data.
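As an illustration only, the first two checks above (the extent of missing data and of duplicate data) could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Scope's actual tooling: the record fields `name` and `affiliation` and the functions `normalize_name` and `quality_report` are hypothetical, and real duplicate detection would need far more than a simple normalized key.

```python
import unicodedata
from collections import Counter


def normalize_name(text: str) -> str:
    """Build a crude comparison key: drop accents and punctuation, lowercase, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def quality_report(records):
    """Count records with missing fields and likely duplicates (hypothetical record layout)."""
    missing_name = 0
    missing_affiliation = 0
    keys = Counter()
    for record in records:
        name_key = normalize_name(record.get("name") or "")
        affiliation_key = normalize_name(record.get("affiliation") or "")
        if not name_key:
            missing_name += 1
            continue  # a record without a name cannot be compared for duplication
        if not affiliation_key:
            missing_affiliation += 1
        keys[(name_key, affiliation_key)] += 1
    # Every occurrence beyond the first of the same key is counted as a likely duplicate.
    duplicates = sum(count - 1 for count in keys.values() if count > 1)
    return {
        "total": len(records),
        "missing_name": missing_name,
        "missing_affiliation": missing_affiliation,
        "likely_duplicates": duplicates,
    }


# Invented sample records, for illustration only.
sample = [
    {"name": "José García", "affiliation": "Univ. of Madrid"},
    {"name": "Jose Garcia", "affiliation": "Univ of Madrid"},
    {"name": "A. Kumar", "affiliation": ""},
    {"name": "", "affiliation": "MIT"},
]
report = quality_report(sample)
```

A report like this only flags candidates; in a workflow of the kind described here, flagged records would still go to an analyst for review.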
Through direct experience, Scope has tackled key challenges in author data cleanup and enhancement, including unstructured data inputs, a variety of naming conventions, data variants, and OCR-induced errors. Scope has experience in rapidly processing large volumes of author records through its automated workflow tool while ensuring a high level of accuracy through multiple layers of quality control at each stage. Scope utilizes various QC tools, including a proprietary Global Check tool, as well as a variety of auto-correction and semi-manual detection and correction tools.
Leveraging the experience and expertise gained over the years, Scope has developed its author data enhancement solution, AuthEntik™, which uses a combination of automated algorithms, manual validation and standardized data repositories to build fast, scalable and high-quality databases of disambiguated author names, their affiliations, publications and other related data. Using this database, AuthEntik can also provide clustering and visualization of relationships across authors. The platform-enabled solution substantially reduces the need for manual validation of author data yet achieves a high level of author database accuracy. It further enhances the productivity of author metadata processing by providing readily deployable modules for each task, which can be customized for different requirements.