This is a solution for Project Risk Management, part 2, in which we discuss how developing a business strategy can help your company cope with aging systems and limited resources that can lead to fragmented IT.

Project Risk Management part 2

Text mining for managing project documents

In construction projects, a high percentage of project information is exchanged using text documents, including contracts, change orders, field reports, requests for information, and meeting minutes, among many others [6]. Managing these documents in model-based information systems, such as Industry Foundation Classes (IFC) based building information models, is challenging because it is difficult to establish relations between the documents and project model objects. Building these connections manually is impractical, since such systems typically store thousands of text documents and hundreds of project model objects. Current technologies for project document management, such as project websites, document management systems, and project contract management systems, do not directly support this integration. The critical issues stem mostly from the large number of documents and project model objects and from differences in vocabulary. Search engines based on term matching are available [15] in many information systems used in construction. However, these tools have limitations in cases where multiple words share the same meaning, where words have multiple meanings, and where relevant documents do not contain the user-defined search terms [6]. Previous research has addressed heterogeneous data representations including text documents [16]; various data analysis tools have also been applied to text data to create thesauri, extract hierarchical concepts, and group similar files for reusing past design information and construction knowledge [17]. In one of the previous studies conducted by our research group, a framework was devised to explore the linguistic features of text documents in order to automatically classify, rank, and associate them with objects in project models [13].
This framework involved several methods, as discussed in our vision for data analysis on non-traditional construction data sources:
– Special Data Preparation operations for text documents were identified: transferring text-based information into flat text files from the original formats (word processor files, spreadsheets, emails, and PDF files); removing irrelevant tags and punctuation from the original documents; removing stop words that are used too frequently to carry useful information for text analysis, such as articles, conjunctions, pronouns, and prepositions; and finally, performing word stemming to remove prefixes and/or suffixes and group words that have the same conceptual meaning.
– The preprocessed text data were transformed into a specific Data Representation using a weighted frequency matrix A = {a_ij}, where a_ij is the weight of word i in document j. Various weighting functions were investigated based on two empirical observations regarding text documents: (1) the more frequent a word is in a document, the more relevant it is to the topic of the document; and (2) the more frequent a word is throughout all documents in the collection, the more poorly it differentiates between documents. By selecting and applying appropriate weighting functions, project documents were represented as vectors in a multi-dimensional space. Query vectors could then be constructed to identify similar documents based on similarity measures such as Euclidean distance and the cosine between vectors.
– Data Analysis tools were applied to develop a logical framework for integrating text documents in model-based information systems. The main goal of this integration framework is to improve the identification and analysis of relevant project documents. The large number of objects in a project model and of text documents in construction projects makes the proposed automated integration framework desirable. It generates significant savings in the time and effort required to link all objects contained in a project model to all relevant documents generated during the project's life cycle [15].
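The preparation, representation, and query steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual implementation: the stop-word list is a tiny hypothetical one, stemming is left out, and the weighting function is assumed to be the common tf-idf form a_ij = tf(i, j) · log(N / df(i)), which is only one of the weighting functions consistent with the two observations in the text. The function names are made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "on", "to", "for", "is", "it"}


def tokenize(text):
    """Lowercase the text, drop punctuation and stop words (no stemming here)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_weighted_matrix(docs):
    """Return (vocabulary, idf weights, one weight vector per document).

    Assumed weighting: a_ij = tf(i, j) * log(N / df(i)).
    tf rewards words frequent in a document (observation 1);
    the log factor penalizes words frequent across the collection (observation 2).
    """
    token_lists = [tokenize(d) for d in docs]
    vocab = sorted({w for tokens in token_lists for w in tokens})
    doc_freq = Counter(w for tokens in token_lists for w in set(tokens))
    idf = {w: math.log(len(docs) / doc_freq[w]) for w in vocab}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append([tf[w] * idf[w] for w in vocab])
    return vocab, idf, vectors


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    vocab, idf, vectors = build_weighted_matrix(docs)
    tf = Counter(tokenize(query))
    query_vector = [tf[w] * idf[w] for w in vocab]
    scores = [cosine(query_vector, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda j: scores[j], reverse=True)
```

Calling `rank_documents("concrete foundation", docs)` on a small list of document texts returns the indices of those documents with the ones sharing the query terms first. Query words that never occur in the collection are simply ignored, which is exactly the term-mismatch limitation noted earlier for term-based search.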


Text Mining and Automated Document Classification

Text mining is increasingly used to denote all tasks that try to extract potentially useful information by analyzing large quantities of text documents. The results of text mining on a collection of documents stored in interorganizational systems can be used to improve information management in such systems and to generate knowledge about the subjects contained in these documents. According to this view, text classification (or categorization) is an instance of text mining (Sebastiani 1999). In this project, the classes are represented by construction project components. Hence, construction document classification was defined as the task of assigning a Boolean value to each pair (d_j, o_i) ∈ D × O, where D is a domain of documents and O = {o_1, …, o_n} is a set of project components (classes). A value of T (true) assigned to C(d_j, o_i) indicates a decision that document d_j is related to component o_i, while a value of F (false) indicates that d_j is not related to component o_i. The document classification task here is binary classification, in which a classification decision for each document is made independently on a class-by-class basis. In binary classification, each document is classified as relevant or not to each of the existing classes. From a theoretical point of view, the binary case is more general than the multilabel case, in the sense that an algorithm for binary classification can also be used for multilabel classification. To do this, the problem of multilabel classification under the n objects is transformed into n independent problems of binary classification, where n is the number of classes. Through the use of machine learning algorithms, a general inductive process automatically builds a classifier (classification model) for each class by observing the characteristics of a set of documents that have previously been classified manually by a domain expert.
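The transformation of a multilabel problem into n independent binary problems can be illustrated with a short Python sketch. The document identifiers, component names, and function names below are hypothetical; the sketch only shows how the Boolean values C(d_j, o_i) are laid out per component, not how a classifier is learned from them.

```python
def to_binary_problems(doc_labels, components):
    """Turn multilabel annotations into one binary labeling per component.

    doc_labels: dict mapping a document id to the set of components it relates to.
    components: list of all project components (the classes).
    Returns a dict: component -> {document id -> True/False}, i.e. C(d_j, o_i).
    """
    return {
        component: {doc: component in labels for doc, labels in doc_labels.items()}
        for component in components
    }


def positive_examples(problems, component):
    """Sorted ids of the documents labeled True for one component."""
    return sorted(doc for doc, related in problems[component].items() if related)


def negative_examples(problems, component):
    """Sorted ids of the documents labeled False for one component."""
    return sorted(doc for doc, related in problems[component].items() if not related)
```

Each entry of the returned dictionary is a self-contained binary task, so a separate classifier can be trained for every component without looking at the labels of the others.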
The classification problem is an activity of supervised learning, since the learning process is driven by the previous knowledge of the categories in some of the documents that will be used to build the model. Hence, this approach relies on the existence of an initial corpus of documents previously classified according to their relevance to a set of project components. A document d_j is called a positive example of o_i if C(d_j, o_i) = T, and a negative example of o_i if C(d_j, o_i) = F. After generating the classification model, it is important to evaluate its effectiveness. One alternative for this evaluation is to split the initial collection of documents into two sets:
• Training set: set of documents that will be used to create the classification model; and
• Test set: set of documents that will be used for testing the effectiveness of the classifier.
The documents in the test set cannot participate in the inductive construction of the classifier; if this condition is not satisfied, the experimental results obtained would probably be unrealistically good. Choosing the size of the training set is also crucial to avoid overfitting, which happens when the classifier makes few errors on the training set but does not generalize to new test cases.
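The split and the reason for keeping the test set separate can be sketched as follows. This is an illustrative example under stated assumptions: the 30% test fraction, the fixed seed, and the deliberately overfit "memorizer" classifier are all hypothetical choices made for the demonstration, not part of the original study.

```python
import random


def split_corpus(examples, test_fraction=0.3, seed=0):
    """Shuffle labeled examples and split them into (training set, test set).

    examples: list of (document text, Boolean label) pairs for one component.
    The test fraction and seed are arbitrary choices for this sketch.
    """
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(classifier, examples):
    """Fraction of examples for which the classifier returns the expected label."""
    correct = sum(1 for text, label in examples if classifier(text) == label)
    return correct / len(examples)


def train_memorizer(training_set):
    """Deliberately overfit classifier: it memorizes the training documents
    and answers False for any document it has not seen."""
    seen = dict(training_set)
    return lambda text: seen.get(text, False)
```

Evaluating the memorizer on its own training set always gives an accuracy of 1.0 when the document texts are distinct, which is the "unrealistically good" result the text warns about. Only `accuracy(classifier, test_set)`, computed on documents held out from training, says anything about how well the classifier generalizes.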
