Designing And Mining A Blood-Bank Management Database System


Designing and Mining a Blood-Bank Management Database System

A Thesis Presented to the Ahsanullah University of Science and Technology
In Partial Fulfilment of the Requirements for the Degree of
Bachelor of Science in Computer Science and Engineering

By
Deepa Das : 07.01.04.006
Fareal Ahmed : 07.01.04.024
Syeda Shabnam Hasan : 07.01.04.030
Jannatul Ferdous : 07.01.04.047

March 2011
Thesis supervisor : Ms. Rosina Surovi Khan

Acknowledgment

The enduring pages of this work are the cumulative result of extensive guidance and arduous work. We wish to acknowledge and express our gratitude to all those without whom our thesis could not have been a reality. We feel very delighted to get this rare opportunity to show our profound sense of reverence and indebtedness to our thesis supervisor, Ms. Rosina Surovi Khan, for the information she provided to us through the lecture sittings and for her invaluable, timely advice and guidance. We would like to extend our sincere thanks to her for giving us her precious time and for always being available to clarify our doubts regarding the thesis. This thesis is dedicated to our parents, who have given us the opportunity of education at the best institutions and support throughout our lives. Last but not least, we would like to thank all those who have directly or indirectly helped and cooperated in accomplishing this thesis.

Abstract

A database is a highly useful environment in which to store data and an ideal tool to manage and manipulate that data. A well-structured database brings many benefits, notably increased efficiency and time savings. Our team's interest is centered around this area. At the very start, we create a database for a blood-bank management system. We use Microsoft SQL Server for this purpose. We determine attributes and entities and figure out the relationships among entities. Then we draw the entity-relationship diagram, convert it to a relational model (relational tables) and normalize the tables. We implement the design, create tables and insert values into the tables using SQL Server. We execute sample queries on the system and verify that our system contains all required information, making retrieval of the information fast and efficient. In Part II of the thesis, we convert the database tables of the system to text files. Using exact and approximate string matching algorithms, we match a query string against the strings in the text files. For exact matching we obtain the index of each exactly matched string; for approximate matching we obtain the approximately matched strings together with the edit distance between the query string and each match.

CONTENTS

Part-I
1 Introduction
  1.1 What is a database?
  1.2 Advantages of Database
  1.3 Disadvantages of Database
  1.4 Components of Database Design
2 Database Design
  2.1 The Entity-Relationship Model
  2.2 Relational Schemas
  2.3 Normalization
  2.4 Tables with sample values after Normalization
  2.5 Queries

Part-II
3 Text File Exportation
  3.1 Exporting Text Files From Database Tables
  3.2 Elementary program in C to read strings of text files
    3.2.1 Procedure
    3.2.2 The code
  3.3 Converted Text Files
  3.4 Selecting a text file for mining
4 Exact String Matching Algorithms
  4.1 Research on Exact String Matching Algorithms
    4.1.1 Brute Force Algorithm
    4.1.2 Morris-Pratt Algorithm
    4.1.3 Knuth-Morris-Pratt Algorithm
    4.1.4 Boyer-Moore Algorithm
    4.1.5 Analysis
5 Approximate String Matching Algorithms
  5.1 Algorithm on finding Edit Distance between two strings
    5.1.1 Definition of String Edit Distance
    5.1.2 Algorithm
    5.1.3 Dry Run
  5.2 Introduction to Approximate String Matching Algorithms
    5.2.1 Brute Force Algorithm
    5.2.2 Lipschitz Embeddings Algorithm
    5.2.3 Ball Partitioning Algorithm
    5.2.4 Analysis
6 Conclusion and Future work
  6.1 Conclusion
  6.2 Future work
8 Appendix
  8.1 Code to measure Edit Distance
  8.2 Implementation of Exact String Matching Algorithms
    8.2.1 Brute Force Algorithm
    8.2.2 Boyer Moore Algorithm
    8.2.3 Knuth Morris Pratt Algorithm
    8.2.4 Morris Pratt Algorithm
  8.3 Implementation of Approximate String Matching Algorithms
    8.3.1 Brute Force
    8.3.2 Lipschitz Embeddings
    8.3.3 Ball Partitioning Method

Part-I

Chapter 1
Introduction

1.1 What is a database?
A database is a collection of organized, interrelated data. Traditionally the data will be presented something like this:

firstname  surname  Dob
John       Smith    01/12/76
Sara       Jones    13/06/69
Fred       Bloggs   11/11/73

Tables in a database are used for storing specific collections of data.

1.2 Advantages of Database
- All of the information is kept together.
- The information can be portable if stored on a laptop.
- The information is easy to access at any time.
- It is easily retrievable.
- Many people can access the same database at the same time.
- Improved data security.
- Reduced data redundancy.
- Reduced updating errors and increased consistency.
- Greater data integrity and independence from application programs.
- Improved data access for users through the use of host and query languages.
- Reduced data entry, storage, and retrieval costs.
- Easier development of new application programs.

1.3 Disadvantages of Database
- Database systems are complex, difficult, and time-consuming to design.
- Initial training is required for all programmers and users.
- Suitable hardware and software bring start-up costs.
- A longer running time for individual applications.
- Damage to the database affects virtually all application programs.
- Extensive conversion costs in moving from a file-based system to a database system.

1.4 Components of Database Design
- Entity-relationship model
- Relational model (relational tables)
- Normalization of tables
- Implementation in SQL Server
- Usage of the system (execution of sample complex queries)

Chapter 2
Database Design

2.1 The Entity-Relationship Model
The entity-relationship (E-R) model was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database. The E-R data model is one of several semantic data models; the semantic aspect of the model lies in its representation of the meaning of the data. The E-R model is very useful in mapping the meanings and interactions of real-world enterprises onto a conceptual schema. The E-R data model has three basic notions: entity sets, relationship sets and attributes.

Entity sets: An entity is a thing or object in the real world that is distinguishable from all other objects. It has a set of properties, and the values for some set of properties may uniquely identify an entity. An entity set is a set of entities of the same type that share the same properties.

Relationship sets: A relationship is an association among several entities. A relationship set is a set of relationships of the same type. The association between entity sets is referred to as participation. The function that an entity plays in a relationship is called that entity's role.

Attributes: An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. The designation of an attribute for an entity set expresses that the database stores similar information concerning each entity in the entity set; however, each entity may have its own value for each attribute. [1]

Some important features of the E-R model:

Mapping Cardinality: Mapping cardinalities express the number of entities to which another entity can be associated via a relationship set. Cardinality can be:
- One-to-one: an entity of one set can be associated with at most one entity of the other set.
- One-to-many: an entity of the first set can be associated with any number of entities of the second set, while an entity of the second set can be associated with at most one entity of the first set.
- Many-to-one: an entity of the first set can be associated with at most one entity of the second set, while an entity of the second set can be associated with any number of entities of the first set.
- Many-to-many: entities of both sets can be associated with any number of entities of the other set.

E-R diagram: It expresses the overall logical structure of a database graphically. A diagram consists of the following major components:
# Rectangles: represent entity sets.
# Ellipses: represent attributes.
# Diamonds: represent relationship sets.
# Lines: link attributes to entity sets and entity sets to relationship sets. [1]
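The four mapping cardinalities can be illustrated with a short Python sketch. It is not part of the original thesis: the function name `mapping_cardinality` and the sample pairs are hypothetical. It also classifies only the cardinality observed in the given pairs, whereas the cardinality of a design is a constraint on all possible data.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a relationship given as (left_entity, right_entity) pairs."""
    right_per_left = defaultdict(set)
    left_per_right = defaultdict(set)
    for left, right in pairs:
        right_per_left[left].add(right)
        left_per_right[right].add(left)
    # Does some left entity have several right partners, and vice versa?
    left_has_many = any(len(v) > 1 for v in right_per_left.values())
    right_has_many = any(len(v) > 1 for v in left_per_right.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"

# Hypothetical sample: (district id, donor id) pairs.
sample_pairs = [(1, 101), (1, 102), (2, 103)]
print(mapping_cardinality(sample_pairs))
```

In the sample, district 1 is paired with two donors while every donor is paired with a single district, so the pairs are classified as one-to-many.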

[Figure 2.1.1 : ER diagram of Blood Bank Management System]

Our E-R diagram represents the Blood-Bank Management system. It has eight entity sets:
a) Donor: (Attributes- dName, dId, sex, age, dreg date, db grp).
b) District: (Attributes- dis id, dis name).
c) Registration Staff: (Attributes- rs id, rs name, sex).
d) Blood Recipient: (Attributes- rId, sex, age, r regdate, rName, b qnty, rb grp).
e) Blood Sample: (Attributes- b group, sample no, status).
f) Disease Recognizer: (Attributes- drecog id, drecog name, sex).
g) Blood Processing Manager: (Attributes- bm id, bm name, sex).
h) Hospital: (Attributes- hId, hName, hb grp, hb qnty).

Abbreviations of all attributes are explained in the relational schemas.

Some notes about the entity sets, their attributes and the cardinalities among them:

Donor: A person who donates blood. When a donor donates, an id (a serial number that serves as the primary key) is assigned; the donor's age, sex, name, registration date (dreg date) and blood group are stored in the database under the entity Donor.
District: Every district or location has a different id (primary key).
Registration Staff: Registration staff register the information of donors and recipients.
Disease Recognizer: A disease recognizer tests blood samples to determine whether they are contaminated or safe.
Blood Processing Manager: Blood processing managers take orders from hospitals and fulfil their requirements for blood samples.
Blood Sample: The quantities of blood that the blood bank holds. The group, sample no and status of each sample are stored.
Hospital: The hospitals of each district where blood samples are needed are also included in the database.
Blood Recipient: A person who needs blood. A recipient's id, name, age, sex and blood group are stored in the database.

Cardinality:

District & Donor- (Relationship- (stays in), 1 to many). One donor stays in one district. In one district, many donors can stay.

Registration Staff & Donor- (Relationship- (registers), 1 to many). A staff member can register many donors. One donor is registered by one staff member.

Registration Staff & Blood Recipient- (Relationship- (records), 1 to many). A staff member can register many blood recipients. One blood recipient is registered by one staff member.

District & Blood Recipient- (Relationship- (resides in), 1 to many). One recipient stays in one district. In one district, many recipients can stay.

District & Hospital- (Relationship- (belongs to), 1 to many). In a district, there are many hospitals. One hospital belongs to one district.

Blood Processing Manager & Hospital- (Relationship- (submit orders to), 1 to many). A blood processing manager can get orders from many hospitals. One hospital submits orders to one blood processing manager.

Blood Processing Manager & Blood Sample- (Relationship- (processes sample), 1 to many). A manager can process many samples of blood. One blood sample is processed by one blood processing manager.

Disease Recognizer & Blood Sample- (Relationship- (verify), 1 to many). A disease recognizer can verify many blood samples. One blood sample is verified by one disease recognizer.

Blood Processing Manager & Blood Recipient- (Relationship- (request to), 1 to many). The samples of blood are given according to the needs of the recipients and are processed by the manager. A manager can process many samples of blood that are requested by recipients, but one recipient can submit requests to only one blood processing manager.
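A common way to carry a 1-to-many relationship into relational tables is to place the primary key of the "one" side as a foreign key in the table of the "many" side. The Python sketch below applies that rule to the nine relationships listed above. It is an illustration only: the function name `foreign_keys` is hypothetical, and the primary keys are taken to be the id attributes of the entity sets.

```python
# (one side, many side, relationship name), as in the cardinality list.
ONE_TO_MANY = [
    ("District", "Donor", "stays in"),
    ("Registration Staff", "Donor", "registers"),
    ("Registration Staff", "Blood Recipient", "records"),
    ("District", "Blood Recipient", "resides in"),
    ("District", "Hospital", "belongs to"),
    ("Blood Processing Manager", "Hospital", "submit orders to"),
    ("Blood Processing Manager", "Blood Sample", "processes sample"),
    ("Disease Recognizer", "Blood Sample", "verify"),
    ("Blood Processing Manager", "Blood Recipient", "request to"),
]

# Primary key of every entity set that appears on a "one" side.
PRIMARY_KEY = {
    "District": "dis id",
    "Registration Staff": "rs id",
    "Blood Processing Manager": "bm id",
    "Disease Recognizer": "drecog id",
}

def foreign_keys(relationships, primary_key):
    """Map each 'many' side table to the foreign keys it receives."""
    result = {}
    for one_side, many_side, _name in relationships:
        keys = result.setdefault(many_side, [])
        key = primary_key[one_side]
        if key not in keys:
            keys.append(key)
    return result

for table, keys in foreign_keys(ONE_TO_MANY, PRIMARY_KEY).items():
    print(table, "->", keys)
```

Under these assumptions, Donor receives dis id and rs id, Blood Recipient receives rs id, dis id and bm id, Hospital receives dis id and bm id, and Blood Sample receives bm id and drecog id.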

2.2 Relational Schemas

Donor
Table 2.2.1
Attribute Name   Description                       Type
dName            Name of the donor                 varchar
dId              Id of the donor                   int
sex              Sex of the donor                  char
age              Age of the donor                  int
dreg date        Registration date of the donor    date
rs id (fk)       Id of the registration staff      int
dis id (fk)      District id                       int
db grp           Donor's blood group               varchar

The relationship between Registration Staff and Donor is 1 to many. That is why the primary key of Registration Staff is used as a foreign key in Donor.
The relationship between District and Donor is 1 to many. That is why the primary key of District is used as a foreign key in Donor.

District
Table 2.2.2
Attribute Name   Description             Type
dis id           District id             int
dis name         Name of the district    varchar

Registration Staff
Table 2.2.3
Attribute Name   Description                      Type
rs id            Id of the registration staff     int
rs name          Name of the registration staff   varchar
sex              Sex of the registration staff    char

Blood Recipient
Table 2.2.4
Attribute Name   Description                          Type
rId              Id of the recipient                  int
sex              Sex of the recipient                 char
age              Age of the recipient                 int
r regdate        Registration date of the recipient   date
rName            Name of the recipient                varchar
b qnty           Needed quantity of blood             int
rb grp           Recipient's blood group              varchar
rs id (fk)       Id of the registration staff         int
dis id (fk)      District id                          int
bm id (fk)       Blood processing manager's id        int

The relationship between Registration Staff and Blood Recipient is 1 to many. That is why the primary key of Registration Staff is used as a foreign key in Blood Recipient.
The relationship between District and Blood Recipient is 1 to many. That is why the primary key of District is used as a foreign key in Blood Recipient.
The relationship between Blood Processing Manager and Blood Recipient is 1 to many. That is why the primary key of Blood Processing Manager is used as a foreign key in Blood Recipient.

Blood Sample
Table 2.2.5
Attribute Name   Description                      Type
b group          Blood group of the sample        varchar
sample no        Sample identification number     int
status           Status of the blood sample       varchar
drecog id (fk)   Disease Recognizer's id          int
bm id (fk)       Blood processing manager's id    int

The relationship between Disease Recognizer and Blood Sample is 1 to many. That is why the primary key of Disease Recognizer is used as a foreign key in Blood Sample.
The relationship between Blood Processing Manager and Blood Sample is 1 to many. That is why the primary key of Blood Processing Manager is used as a foreign key in Blood Sample.

Disease Recognizer
Table 2.2.6
Attribute Name   Description                  Type
drecog id        Disease Recognizer's id      int
drecog name      Disease Recognizer's name    varchar
sex              Disease Recognizer's sex     char

Blood Processing Manager
Table 2.2.7
Attribute Name   Description                        Type
bm id            Blood processing manager's id      int
bm name          Blood processing manager's name    varchar
sex              Blood processing manager's sex     char

Hospital
Table 2.2.8
Attribute Name   Description                              Type
hId              Hospital's id                            int
hb qnty          Needed quantity of blood in a hospital   int
hb grp           Needed blood group                       varchar
hName            Hospital's name                          varchar
dis id (fk)      District's id                            int
bm id (fk)       Blood processing manager's id            int

The relationship between District and Hospital is 1 to many. That is why the primary key of District is used as a foreign key in Hospital.
The relationship between Blood Processing Manager and Hospital is 1 to many. That is why the primary key of Blood Processing Manager is used as a foreign key in Hospital.
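As an illustration of how such schemas translate into table definitions, the sketch below creates three of the tables and shows a foreign key rejecting a row that refers to a missing district. It rests on several assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, writes attribute names with underscores, picks column lengths arbitrarily, and inserts made-up sample rows. The helper name `orphan_donor_rejected` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE District (
    dis_id   INTEGER PRIMARY KEY,
    dis_name VARCHAR(50)
);
CREATE TABLE Registration_Staff (
    rs_id   INTEGER PRIMARY KEY,
    rs_name VARCHAR(50),
    sex     CHAR(1)
);
CREATE TABLE Donor (
    dId       INTEGER PRIMARY KEY,
    dName     VARCHAR(50),
    sex       CHAR(1),
    age       INTEGER,
    dreg_date DATE,
    rs_id     INTEGER REFERENCES Registration_Staff(rs_id),
    dis_id    INTEGER REFERENCES District(dis_id),
    db_grp    VARCHAR(3)
);
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO District VALUES (1, 'District A')")
conn.execute("INSERT INTO Registration_Staff VALUES (1, 'Staff A', 'F')")
conn.execute(
    "INSERT INTO Donor VALUES (1, 'Donor A', 'M', 30, '2011-01-15', 1, 1, 'A+')"
)

def orphan_donor_rejected(connection):
    """Return True if a donor that points at a missing district is refused."""
    try:
        connection.execute(
            "INSERT INTO Donor VALUES "
            "(2, 'Donor B', 'F', 25, '2011-02-01', 1, 99, 'O+')"
        )
    except sqlite3.IntegrityError:
        return True
    return False

print(orphan_donor_rejected(conn))
```

The remaining tables would follow the same pattern, each "many" side table carrying a REFERENCES clause for every foreign key listed in its schema.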

2.3 Normalization
E. F. Codd introduced a number of 'normal forms' (1970-1972). They are principles that can hold for a given relation or not.

The formal definition of normalization is: it is the sequence of steps by which a relational database model is both created and improved upon. The steps involved in the normalization process are called normal forms. Essentially, normal forms applied during a process of normalization allow creation of a relational database model as a step-by-step progression.

Normal Forms:

First Normal Form (1NF): A relation is in first normal form if it contains only simple, atomic values for attributes, no sets; that is, if its attributes do not have sub-attributes.

Second Normal Form (2NF): A relation is in second normal form if it is in 1NF and every non-primary key attribute is fully functionally dependent on the primary key of the relation.
Example:
Relation: (A, B, C, D)
{A} → {B}
{A} → {C}
{A} → {D}
It is in 2NF because it is in 1NF and every non-primary key attribute is fully functionally dependent on the primary key of the relation. [2]

Third Normal Form (3NF): A relation is in third normal form if it is in 2NF and no non-primary key attribute is transitively dependent on the primary key.
Example:
Relation: (A, B, C, D, E)

{A, B} → {C}
{A, B} → {E}
This relation is in third normal form because it is in 2NF and no non-primary key attribute is transitively dependent on the primary key.

Boyce-Codd Normal Form (BCNF): A relation is in BCNF if, for every full functional dependency X → Y that holds, X is a candidate key. If part of the primary key is fully functionally dependent on a non-primary key attribute, a BCNF violation occurs.
Example:
Relation: (A, B, C, D)
{A, B} → {C, D}
{C} → {A}
This relation is in 1NF, 2NF and 3NF, but not in BCNF, because A, which is part of the primary key, is fully functionally dependent on the non-primary key attribute C. We have to split the original relation:
(A, B, D), (C, A).
Each relation is now in BCNF.

Advantages of normalization:
i. Many unnecessary redundancies are avoided.
ii. Anomalies with input, deletion and updates can be avoided.
iii. Fully normalized relations tend to need less space than non-normalized ones.

Disadvantages of normalization:
i. Normalization splits entities and relationships into many relations, thus making them harder to understand.
ii. Queries become more complex because they have to involve more relations.
iii. Response times are longer because of a higher number of joins in the queries. [2]

Normalization of Blood Bank database:

1. Donor (dId, dName, sex, age, dreg date, rs id, dis id, db grp)
{dId} → {dName} (functional dependency exists, because two different dNames do not correspond to the same dId).

{dId} → {sex} (functional dependency exists).
{dId} → {age} (functional dependency exists).
{dId} → {dreg date} (functional dependency exists).
{dId} → {rs id} (functional dependency exists).
{dId} → {dis id} (functional dependency exists).
{dId} → {db grp} (functional dependency exists).
The relation is in 1NF because its attributes do not have sub-attributes.
The relation is in second normal form, as it is in 1NF and every non-primary key attribute is fully functionally dependent on the primary key of the relation.
The relation is in third normal form, as it is in 2NF and no non-primary key attribute is transitively dependent on the primary key.
No part of the primary key is fully functionally dependent on a non-primary key attribute. So, the relation is in BCNF.

2. District (dis id, dis name)
{dis id} → {dis name}
The relation is in 1NF.
The relation is in second normal form.
The relation is in third normal form.
The relation is in BCNF.

3. Registration staff (rs id, rs name, sex)
{rs id} → {rs name} (functional dependency exists).
{rs id} → {sex} (functional dependency exists).
The relation is in 1NF.
The relation is in second normal form.
The relation is in third normal form.
The relation is in BCNF.

4. Blood recipient (rId, sex, age, r regdate, rName, b qnty, rb grp, rs id, dis id, bm id)
{rId} → {sex} (functional dependency exists).
{rId} → {age} (functional dependency exists).
{rId} → {r regdate} (functional dependency exists).
{rId} → {rName} (functional dependency exists).
{rId} → {b qnty} (functional dependency exists).

{rId} → {rb grp} (functional dependency exists).
{rId} → {rs id} (functional dependency exists).
{rId} → {dis id} (functional dependency exists).
{rId} → {bm id} (functional dependency exists).
The relation is in 1NF.
The relation is in second normal form.
The relation is in third normal form.
The relation is in BCNF.

5. Blood Sample (b group, sample no, status, drecog id, bm id)
{b group, sample no} → {status} (functional dependency exists).
{b group, sample no} → {drecog id} (functional dependency exists).
{b grou
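The functional-dependency reasoning used in this section can be checked mechanically. The following Python sketch is not part of the original thesis. The helper names `closure` and `bcnf_violations` are hypothetical, and it assumes the common formulation in which a relation is in BCNF when the left side of every non-trivial functional dependency determines all attributes of the relation. It is applied to the BCNF example relation (A, B, C, D) and to the Donor relation.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    relation = set(relation)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not relation <= closure(lhs, fds)
    ]

# The BCNF example: {A, B} -> {C, D} and {C} -> {A}.
example_relation = ("A", "B", "C", "D")
example_fds = [(("A", "B"), ("C", "D")), (("C",), ("A",))]

# The Donor relation: dId determines every other attribute.
donor = ("dId", "dName", "sex", "age", "dreg date", "rs id", "dis id", "db grp")
donor_fds = [(("dId",), donor[1:])]

print(bcnf_violations(example_relation, example_fds))
print(bcnf_violations(donor, donor_fds))
```

For the example relation, only {C} → {A} is reported, since C determines just A and C. For Donor, no violation is reported because dId determines every attribute.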
