
dc.contributor.advisor Hadaegh, Ahmad Reza en_US
dc.contributor.author Tang, Jiali
dc.date.accessioned 2019-12-03T00:00:18Z
dc.date.issued 2019-12-02
dc.date.submitted 2019-12-02
dc.identifier.uri http://hdl.handle.net/10211.3/214328
dc.description.abstract The amount of data generated by drug discovery grows exponentially each year, which makes extracting useful information from that data increasingly critical. Even with a central repository to hold the massive amounts of data, organizations need tools that help them extract the most useful information from it. A data warehouse can bring data together in a single format, supplemented by metadata, through a set of input mechanisms known as extraction, transformation, and loading (ETL) tools. Extraction pulls either existing data or data that is imported into the database. Transformation translates the data into a format the database can understand, which makes the new data consistent with the existing data. Finally, the formatted data is loaded into files, and the link address of the data is saved in tables in the database for further analysis. Analysis of the data includes simple query and reporting, statistical analysis, complex multidimensional analysis, and data mining. Large quantities of data are searched and analyzed to discover useful patterns or relationships, which are then used to predict behavior. The purpose of this project is to produce a repository database of drugs, drug features (properties), and drug targets in which data can be mined and analyzed. Drug targets are proteins that drugs try to bind to in order to stop the protein's activity. For example, -secretase is a protein that causes Alzheimer’s disease. Certain drugs can bind to -secretase and stop its function, which in turn may stop Alzheimer’s disease. Users can mine the database to predict which chemical properties determine the relative efficacy against a specific target, along with the coefficient for each chemical property. 
This database can be equipped with different data mining approaches and algorithms, such as linear, non-linear, and classification types of data modeling. The data models have been enhanced with Genetic Evolution (GE) algorithms [1-17]. This paper discusses the implementation with linear data models such as Multiple Linear Regression (MLR) [18], Partial Least Squares Regression (PLSR) [19], and Support Vector Machine (SVM) [20]. en_US
dc.description.sponsorship Computer Science en_US
dc.language.iso en_US en_US
dc.subject Data Mining en_US
dc.subject Database System en_US
dc.subject Drug Discovery en_US
dc.title A Repository Database System to do Data Mining in Drug Discovery en_US
dc.description.embargoterms 3 years en_US
dc.date.embargountil 2022-12-02T00:00:18Z
dc.genre Project en_US
dc.contributor.committeemember Ye, Xin en_US

