Time Series Variation Based on Power Links and Field Association Words in English and Arabic
Some words appear frequently in a text and are known as keywords because they are closely associated with the subject of that text. The frequencies of these words change over time, showing time-series variation. However, conventional text processing methods and text search techniques do not consider the significance of this frequency change over time. Traditional methods were therefore unable to correctly determine how popular a word was at a given time. A new method is proposed to automatically estimate the stability classes (increasing, relatively constant, and decreasing) that indicate the popularity of a word over time, based on the change in its frequency in past text data. First, learning data were created by defining five attributes that quantitatively measure the change in a word's frequency and by extracting these five attributes automatically from electronic texts. These learning data were classified manually into the three stability classes. The data were then used to build a decision tree (DT) that automatically determines the stability classes of the test data. However, the accuracy of the decision tree estimate decreases when the data are unevenly distributed between classes. A Random Sampling Approach and a new Data Copying Method are therefore introduced to increase the accuracy of decision tree estimation. In addition, a Field Association (FA) term is the smallest word or conceptual unit capable of indicating the field of a text within a scheme called a field tree. A new methodology is used that is based on the frequencies of these FA terms within a given period of time. The technique automatically estimates the stability classes of FA terms to indicate how the popularity of the listed FA terms changes over time and to increase the accuracy of the decision tree.
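The class-balancing idea behind the Random Sampling Approach and the Data Copying Method can be illustrated with a minimal Python sketch. It is not the procedure used in the book: it assumes that random sampling means drawing a fixed number of examples from an over-represented class and that data copying means duplicating examples of an under-represented class. The function name `balance_classes`, the `target` parameter, and the placeholder attribute values are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_classes(samples, target, seed=0):
    """Return a list with exactly `target` examples per stability class.

    `samples` is a list of (attributes, label) pairs. This is an assumed
    reading of "random sampling" and "data copying", not the book's method.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for attrs, label in samples:
        by_label[label].append((attrs, label))

    balanced = []
    for label, group in by_label.items():
        if len(group) >= target:
            # Random sampling: draw `target` examples without replacement.
            balanced.extend(rng.sample(group, target))
        else:
            # Data copying: keep every example, then duplicate random ones.
            extra = [rng.choice(group) for _ in range(target - len(group))]
            balanced.extend(group + extra)
    return balanced


# Placeholder learning data: each example has five attribute values
# (hypothetical numbers) and a manually assigned stability class.
data = (
    [([i] * 5, "constant") for i in range(8)]
    + [([i] * 5, "increasing") for i in range(3)]
    + [([i] * 5, "decreasing") for i in range(2)]
)

balanced = balance_classes(data, target=4)
counts = Counter(label for _, label in balanced)
```

After balancing, each of the three classes contributes the same number of examples, so a decision tree trained on `balanced` is no longer dominated by the largest class.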
In addition, all previous methods are based on keywords in English and French. Extending keyword analysis to other languages, such as Arabic, could therefore strengthen further research in this area. This approach incorporates a new method of extracting Arabic keywords from corpora based on changes in text over several periods of time, using a decision tree built on the frequencies of those keywords. The new methodology is applied to a data set from a new field (computer science), which distinguishes it from previously used approaches. Finally, to examine the co-occurrences of words in the publications of a given topic, Power Links Analysis (PLA) was developed to overcome the shortcomings of conventional approaches. This approach considers both an advanced form of frequency and the distance between the various instances of the given terms. A new technique has been proposed that extracts the stability classes of Field Association terms based on automated Power Links analysis to improve decision tree accuracy. In this method, we analysed the effects of time variation based on the frequencies of specific FA words occurring in documents of a particular period, using power links.
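As a rough illustration of how co-occurrence frequency and distance can be combined into a single link score, consider the minimal Python sketch below. It does not reproduce the Power Links formula used in the book: the inverse-distance weighting, the `window` cut-off, and the function name `link_strength` are assumptions made only for this example.

```python
def link_strength(tokens, term_a, term_b, window=10):
    """Score how strongly two terms are linked in a token sequence.

    Every pair of occurrences within `window` tokens adds 1/distance,
    so the score grows with co-occurrence frequency and shrinks with
    distance. The weighting is an illustrative assumption.
    """
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b]

    score = 0.0
    for i in pos_a:
        for j in pos_b:
            distance = abs(i - j)
            if 0 < distance <= window:
                score += 1.0 / distance
    return score


# Hypothetical example text from the computer science field.
tokens = "the compiler parses code and the compiler emits code".split()

strength = link_strength(tokens, "compiler", "code")
close_only = link_strength(tokens, "compiler", "code", window=2)
```

Computing such a score separately for the documents of each time period would give one value per period, which could then be tracked over time in the same way as a term frequency.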
Department of Computer Science, Faculty of Science, Tanta University, Tanta, Egypt and College of Computer Science and Engineering at Yanbu, Taibah University, Saudi Arabia.
View Book: https://bp.bookpi.org/index.php/bpi/catalog/book/312