Overcoming the challenges that imbalanced data poses to AI technologies
In recent years, artificial intelligence (AI) technology has become ubiquitous, making its presence felt in fields such as healthcare, marketing, and finance. AI-based decision-making is also expected to affect operations and automation tasks. Despite these expectations, two obstacles keep AI technologies from reaching their full potential: lack of data and imbalanced data. With these challenges in mind, Fujitsu Laboratories has developed Wide Learning, a machine learning technology capable of making highly accurate judgments even when the data is imbalanced.
According to Fujitsu, Wide Learning creates combinations of data items to extract large volumes of hypotheses. It can also adjust the degree of impact of knowledge chunks to build an accurate classification model. Specifically, the technology treats every combination pattern of data items as a hypothesis.
It then determines the importance of each hypothesis based on its hit rate for the label category. Even when the target data is insufficient, the system can extract every hypothesis worth checking. These hypotheses may also contribute to the discovery of previously unconsidered explanations.
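The idea of treating item combinations as hypotheses and ranking them by hit rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Fujitsu's implementation: the function name `enumerate_hypotheses`, the `max_items`, `min_hit_rate`, and `min_support` parameters, and the toy data are all hypothetical, and "hit rate" is assumed here to mean the fraction of matching records that carry the positive label.

```python
from itertools import combinations


def enumerate_hypotheses(records, labels, max_items=2,
                         min_hit_rate=0.8, min_support=1):
    """Return (conditions, hit_rate, support) for each retained hypothesis.

    A hypothesis is a combination of (item, value) conditions. Its hit rate
    is assumed to be the share of matching records labelled 1.
    """
    items = sorted(records[0].keys())
    kept = []
    for size in range(1, max_items + 1):
        for item_combo in combinations(items, size):
            # Consider only value combinations that occur in the data.
            seen = {tuple(r[i] for i in item_combo) for r in records}
            for values in sorted(seen):
                matched = [
                    lab for r, lab in zip(records, labels)
                    if all(r[i] == v for i, v in zip(item_combo, values))
                ]
                support = len(matched)
                hit_rate = sum(matched) / support
                if support >= min_support and hit_rate >= min_hit_rate:
                    kept.append((tuple(zip(item_combo, values)),
                                 hit_rate, support))
    return kept


# Toy data (hypothetical): only 2 of 5 records are positive.
records = [
    {"age": "young", "owns_car": "yes"},
    {"age": "young", "owns_car": "no"},
    {"age": "old", "owns_car": "yes"},
    {"age": "old", "owns_car": "no"},
    {"age": "young", "owns_car": "yes"},
]
labels = [1, 0, 0, 0, 1]

chunks = enumerate_hypotheses(records, labels)
for conditions, hit_rate, support in chunks:
    print(conditions, hit_rate, support)
```

In this toy data no single item reaches the 0.8 threshold on its own, yet the combination of `age=young` and `owns_car=yes` matches only positive records, which is the kind of pattern an exhaustive search over combinations can surface even from a small sample.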
Wide Learning overcomes the challenges posed by imbalanced data by building a classification model based on multiple extracted knowledge chunks and on the target label. In this process, when the items making up one knowledge chunk overlap with the items making up other knowledge chunks, the system controls their degree of impact, reducing the weight of their influence on the classification model. In this way, Wide Learning can train a model capable of accurate classification even when the target label, or the data marked as correct, is imbalanced.
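The article does not say how the degree of impact is computed, so the following is a deliberately simple stand-in rather than Wide Learning's actual method. It assumes, purely for illustration, that each chunk's base weight is divided by one plus the number of other chunks sharing at least one item with it. The names `overlap_adjusted_weights` and `score`, and the sample chunks, are hypothetical.

```python
def overlap_adjusted_weights(chunks):
    """Down-weight chunks whose items also appear in other chunks.

    chunks: list of (conditions, base_weight), where conditions is a tuple
    of (item, value) pairs. The divisor 1 + overlaps is an assumed rule.
    """
    weights = []
    for idx, (conditions, base_weight) in enumerate(chunks):
        items = {item for item, _ in conditions}
        overlaps = sum(
            1 for j, (other, _) in enumerate(chunks)
            if j != idx and items & {item for item, _ in other}
        )
        weights.append(base_weight / (1 + overlaps))
    return weights


def score(record, chunks, weights):
    """Sum the adjusted weights of all chunks that the record satisfies."""
    return sum(
        w for (conditions, _), w in zip(chunks, weights)
        if all(record.get(item) == value for item, value in conditions)
    )


# Hypothetical chunks: the first two share the item "age".
chunks = [
    ((("age", "young"), ("owns_car", "yes")), 1.0),
    ((("age", "young"), ("city", "tokyo")), 1.0),
    ((("job", "nurse"),), 1.0),
]
weights = overlap_adjusted_weights(chunks)

record = {"age": "young", "owns_car": "yes", "city": "tokyo", "job": "clerk"}
print(weights, score(record, chunks, weights))
```

Under this assumed rule, the two chunks that both rely on `age` contribute half a vote each, so a single frequently repeated item cannot dominate the score simply because it appears in many chunks.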