Ping An Insurance (Group) Company of China has filed a patent for a text classification method that uses artificial intelligence. The method involves preprocessing text data, matching tags to the text vector, inputting the tagged text vector into a BERT model, training the untagged text vector with a convolutional neural network model, and using a random forest model for multi-tag classification. The patent aims to achieve accurate and efficient text classification.

According to GlobalData’s company profile on Ping An Insurance (Group) Company of China, digital lending was a key innovation area identified from patents. Ping An Insurance (Group) Company of China's grant share as of June 2023 was 1%. Grant share is the ratio of the number of grants to the total number of patents.

Text classification method using artificial intelligence for accurate classification

Source: United States Patent and Trademark Office (USPTO). Credit: Ping An Insurance (Group) Company of China Ltd

A recently filed patent (Publication Number: US20230195773A1) describes a text classification method and apparatus. The method involves several steps to classify text data into different categories.

The first claim outlines the method, which begins by preprocessing the original text data to obtain a text vector: segmenting the text, removing stopwords, deduplicating the data, and vectorizing it. The method then matches tags to the text vector, which yields a tagged text vector and an untagged text vector. The tagged text vector is fed into a BERT (Bidirectional Encoder Representations from Transformers) model to obtain a word vector feature. The untagged text vector is then trained with a convolutional neural network model using that word vector feature, producing a virtually tagged text vector. Finally, a random forest model performs multi-tag classification on the tagged and virtually tagged text vectors to produce the text classification result.
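The preprocessing and tag-matching steps of the first claim can be illustrated with a minimal Python sketch. It is not the patent's implementation: the stopword list, the function names, the bag-of-words vectorization, and the reading of "deduplicating" as dropping repeated documents are all assumptions made for illustration, and the BERT and convolutional neural network stages are left out.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def segment(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def preprocess(documents):
    """Segment, remove stopwords, deduplicate, and vectorize documents."""
    seen = set()
    token_lists = []
    for doc in documents:
        tokens = tuple(t for t in segment(doc) if t not in STOPWORDS)
        # Assumption: deduplication drops empty and repeated documents.
        if tokens and tokens not in seen:
            seen.add(tokens)
            token_lists.append(tokens)
    # Assumption: a simple bag-of-words count vector per document.
    vocabulary = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


def split_by_tag(vectors, tags):
    """Separate vectors that matched a tag from those that did not."""
    tagged = [(v, t) for v, t in zip(vectors, tags) if t is not None]
    untagged = [v for v, t in zip(vectors, tags) if t is None]
    return tagged, untagged


docs = ["The claim is denied", "the claim is denied", "Payment of the premium"]
vocabulary, vectors = preprocess(docs)
tagged, untagged = split_by_tag(vectors, ["claims", None])
```

In this sketch a tag of `None` marks a vector for which no tag was matched; in the patent's method those untagged vectors are the ones that later receive virtual tags.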

The subsequent claims provide additional details and variations of the method. Claim 3 specifies that the BERT model consists of an input layer, a vector layer, a classification layer, and a coding layer. Claim 7 describes the generation of the random forest model, which involves extracting sample subsets from the tagged and virtually tagged text vectors and training decision tree models using a bagging algorithm. These decision tree models are then combined to form the random forest model.
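The bagging procedure described in claim 7 can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the patented method: each tree is a one-split decision stump, each sample carries a single label rather than multiple tags, and all function names, parameters, and sample data are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-split decision tree (a stump) on (vector, label) samples."""
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= threshold]
            right = [y for x, y in samples if x[f] > threshold]
            if not left or not right:
                continue
            left_label, left_hits = Counter(left).most_common(1)[0]
            right_label, right_hits = Counter(right).most_common(1)[0]
            score = left_hits + right_hits  # correctly classified samples
            if best is None or score > best[0]:
                best = (score, f, threshold, left_label, right_label)
    if best is None:
        # No split separates the data: always predict the majority label.
        majority = Counter(y for _, y in samples).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, x):
    """Classify one vector with a single stump."""
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label


def train_forest(samples, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        subset = [rng.choice(samples) for _ in samples]  # sample with replacement
        forest.append(train_stump(subset))
    return forest


def predict_forest(forest, x):
    """Combine the trees by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical tagged and virtually tagged vectors, pooled for training.
samples = [
    ([1, 0], "claims"), ([2, 0], "claims"), ([3, 1], "claims"),
    ([0, 1], "billing"), ([0, 2], "billing"), ([0, 3], "billing"),
]
forest = train_forest(samples)
```

Calling `predict_forest(forest, vector)` returns the label chosen by most trees. A multi-tag variant could, for example, train one such forest per tag, though the article does not say how the patent handles this.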

The patent also includes claims for a text classification apparatus and a computer-readable storage medium containing the text classification program. These claims mirror the steps outlined in the method claims.

In summary, the patent describes a text classification method and apparatus that involve preprocessing text data, using BERT and convolutional neural network models, and employing a random forest model for classification. The method claims provide specific details about the steps involved, while the apparatus claims cover the hardware implementation. The patent aims to improve the accuracy and efficiency of text classification tasks.

GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData Patent Analytics tracks bibliographic data, legal events data, point in time patent ownerships, and backward and forward citations from global patenting offices. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.