MidTerm Report
Relationship Finder between Individuals
Sharukh Malik (BC083009) Fahad Jamshad (BC083011)
Table of Contents
1. Introduction
   1.1 Purpose
   1.2 Scope
2. Positioning
   2.1 Problem Statement
   2.2 Product Position Statement
3. Stakeholder and Customer Descriptions
   3.1 Market Demographics
   3.2 Stakeholder Summary
   3.3 User Environment
4. Use Case Diagram
5. Extended Use cases
   5.1 Use case 01
      5.1.1 Use case name: Data Collection from DBLP
      5.1.2 Use case reference number: UC 4.1
      5.1.3 Actors: System
      5.1.4 Author: Sharukh Malik
      5.1.5 Goals and Brief description
   5.2 Precondition
      5.2.1 Postconditions
      5.2.2 Flow
      5.2.3 Event Table
      5.2.4 Alternative or Exceptions
   5.3 Use case 02
      5.3.1 Use case name: Data Collection from CiteSeer
      5.3.2 Use case reference number: UC 4.2
      5.3.3 Actors: System
      5.3.4 Author: Sharukh Malik
      5.3.5 Goals and Brief description
      5.3.6 Precondition
      5.3.7 Post conditions
      5.3.8 Flow
      5.3.9 Event Table
      5.3.10 Alternative or Exceptions
   5.4 Use case 03
      5.4.1 Use case name: Data Collection from Web
      5.4.2 Use case reference number: UC 4.3
      5.4.3 Actors: System
      5.4.4 Author: Sharukh Malik
      5.4.5 Goals and Brief description
      5.4.6 Precondition
      5.4.7 Post conditions
      5.4.8 Flow
      5.4.9 Event Table
      5.4.10 Alternative or Exceptions
   5.5 Use case 04
      5.5.1 Use case name: Text parsing
      5.5.2 Use case reference number: UC 4.3
      5.5.3 Actors: System
      5.5.4 Author: Sharukh Malik
      5.5.5 Goals and Brief description
      5.5.6 Precondition
      5.5.7 Post conditions
   5.6 Use case 05
      5.6.1 Use case name: Find common entities
      5.6.2 Use case reference number: UC 4.3
      5.6.3 Actors: System
      5.6.4 Author: Sharukh Malik
      5.6.5 Goals and Brief description
      5.6.6 Precondition
      5.6.7 Post conditions
   5.7 Use case 06
6. Functional and Nonfunctional Requirements
   6.1 Functional Requirements
   6.2 Nonfunctional Requirements
1. Introduction
Finding relationships between individuals is an important practical problem. Discovering such relationships is crucial and has applications in a number of areas and domains, such as the scientific community, enterprises, social networks, security agencies, the investigation of terrorist activities, and job placement. Some scenarios are listed below:
1) finding a conflict of interest in a peer review setting,
2) finding a relationship of person X with a terrorist/criminal person Y,
3) finding a common friend between you and person X in order to establish a collaboration with person X,
4) finding a suitable friend of yours, or a person who is an influential friend of person X, when you want to approach person X,
5) discovering candidate friends of a person X for a social network (for example Facebook, LinkedIn, etc.), and
6) finding clusters/groups of close friends connected for a certain reason.
The current project tries to solve a subset of this problem by designing and implementing generic techniques. A variety of algorithms, heuristics, and approaches will be integrated over a number of available datasets to design such a system. An initial effort was made by Aleman-Meza et al. [1], who applied their algorithms to identify conflicts of interest in the peer review setting. The current project will substantially extend that work by collecting evidence of relationships between individuals from a number of sources, for example co-authorships, social networks, semantic versions of friends ontologies, and home pages from the World Wide Web: 1) the co-authorship data will be collected from DBLP, CiteSeer, J.UCS, and the ACM Digital Library, 2) the social network data will be collected from Facebook, and 3) friends data will be collected from the FOAF ontology [3]. Subsequently, the home pages of the individuals will be extracted, and a heuristic approach along with NLP algorithms will be used to discover structured profiles of the individuals.
One of the biggest challenges in finding relationships between individuals is disambiguating an individual across diverse sources of information. For example, a person may be listed under a number of name forms, such as the full name, or the initial of the first name followed by the last name. Discovering that two such entries represent the same entity is known as entity disambiguation. For this we will use the NameReconciliation algorithm [2], which tries to collect enough evidence between two resources to conclude that both represent the same resource.
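To make the name-variant problem concrete, the following is a minimal sketch of a compatibility check between two name strings. It is not the NameReconciliation algorithm of [2]; the rule it applies (equal last names, and given names that are either equal or an initial of one another) and the function names `normalize` and `may_be_same_person` are assumptions made for illustration. It is written in Python for brevity, although the tool itself is planned in C#.

```python
import re

def normalize(name):
    """Lowercase a name and split it into alphabetic tokens (punctuation is dropped)."""
    return re.findall(r"[^\W\d_]+", name.lower())

def tokens_compatible(x, y):
    """Two given-name tokens agree if they are equal or one is the initial of the other."""
    if x == y:
        return True
    if len(x) == 1 and y.startswith(x):
        return True
    if len(y) == 1 and x.startswith(y):
        return True
    return False

def may_be_same_person(name_a, name_b):
    """Return True if the two names could refer to the same person.

    Assumed rule (illustrative only): the last tokens (last names) must match
    exactly, and the given-name tokens must be pairwise compatible.
    """
    a, b = normalize(name_a), normalize(name_b)
    if not a or not b or a[-1] != b[-1]:
        return False
    return all(tokens_compatible(x, y) for x, y in zip(a[:-1], b[:-1]))

if __name__ == "__main__":
    print(may_be_same_person("John Smith", "J. Smith"))
    print(may_be_same_person("John Smith", "Jane Smith"))
```

A positive result here is only weak evidence: two different people can share a last name and an initial, which is why further evidence (for example shared co-authors or affiliations) would still be needed before merging two records.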
1.1 Purpose
The purpose of the Relationship Finder between Individuals is to find relationships between individuals without requiring the user to know those individuals.
1.2 Scope
The scope of the project is to identify relationships between individuals. Furthermore, each relationship is categorized into levels, from highest to lowest, depending on the evidence collected from different sources. The project integrates datasets from the following sources: DBLP, CiteSeer, FOAF, J.UCS, ACM, home pages from the Web, and social networks (Facebook).
2. Positioning
2.1 Problem Statement
The problem of: finding a relationship between persons, which is a common problem
Affects: persons in any field
The impact of which is: time saving; finding implicit/hidden relationships
A successful solution would be: to discover hidden relations efficiently so that decisions can be made easily
2.2 Product Position Statement
For: persons in any field searching for a computer scientist
Who: researchers, businessmen, educationists, etc.
The (product name): Relationship Finder between Individuals
That: helps them save time in searching for and finding hidden relationships between individuals
Unlike: the current practice, in which they visit any possible website or other sources for this purpose
3. Stakeholder and Customer Descriptions
The stakeholders of the Relationship Finder between Individuals are those users who want to find a relationship between two persons.
3.1 Market Demographics
- Finding a relationship of person X with a terrorist/criminal person Y.
- Finding a common friend between you and person X in order to establish a collaboration with person X.
- Finding a suitable friend of yours, or a person who is an influential friend of person X, when you want to approach person X.
- Discovering candidate friends of a person X for a social network (for example Facebook, LinkedIn, etc.).
3.2 Stakeholder Summary
Stakeholders of the Relationship Finder between Individuals can belong to any field related to computer science.
3.3 User Environment
The tool will be built in C#. Platform independence is a concern for this tool.
4. Use Case Diagram
5. Extended Use cases
5.1 Use case 01
5.1.1 Use case name: Data Collection from DBLP
5.1.2 Use case reference number: UC 4.1
5.1.3 Actors: System
5.1.4 Author: Sharukh Malik
5.1.5 Goals and Brief description:
This use case begins when the user enters the name of a person to find that person's data, or gives the names of two persons to find the relationship between them. The system looks up the data in the database (DBLP) and displays the result of that query to the user. If any publications or research papers of the person are present in the database (DBLP), the system displays them.
5.2 Precondition
Application running: The Relationship Finder between Individuals application should be running.
MySQL connection: The connection to the MySQL database should be established.
5.2.1 Postconditions
After the system has performed the basic flow of this use case, the following states should hold:
Papers list
5.2.2 Flow
Actor Action:
1. The user enters the name of one person or of two persons.
3. Search for that name in the DBLP author list.
5. Match found.
System Response:
2. Validate the person's name.
4. Search until a match is found.
6. Display the result.
5.2.3 Event Table
Trigger: Search
Source: DBLP
Destination: System
5.2.4 Alternative or Exceptions:
If no match is found, the system informs the user that no such author exists. If multiple results are found, the system asks the user to identify the correct author name.
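The flow and exceptions described for this use case can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the author list is a plain in-memory list standing in for the DBLP data held in MySQL, the validation rule (letters, spaces, dots, hyphens, apostrophes) is assumed, and the names `validate_name` and `search_author` are hypothetical. Python is used for brevity.

```python
def validate_name(name):
    """Step 2: collapse whitespace and accept only plausible name characters.

    Returns the cleaned name, or None if the input is empty or contains
    characters other than letters, spaces, dots, hyphens, and apostrophes.
    """
    cleaned = " ".join(name.split())
    valid = bool(cleaned) and all(ch.isalpha() or ch in " .-'" for ch in cleaned)
    return cleaned if valid else None

def search_author(name, author_list):
    """Steps 3 to 6: search the author list and classify the outcome.

    Returns a (status, matches) pair where status is one of
    "invalid", "no_match", "ambiguous", or "found".
    """
    cleaned = validate_name(name)
    if cleaned is None:
        return ("invalid", [])
    query = cleaned.lower()
    # Case-insensitive substring match against every author in the list.
    matches = [author for author in author_list if query in author.lower()]
    if not matches:
        return ("no_match", [])       # exception: no such author exists
    if len(matches) > 1:
        return ("ambiguous", matches)  # exception: user must pick the right author
    return ("found", matches)

if __name__ == "__main__":
    authors = ["Alice Khan", "Ali Raza", "Bob Stone"]  # hypothetical sample data
    print(search_author("Bob", authors))
    print(search_author("Ali", authors))
```

Returning a status alongside the matches keeps the two exception paths (no match, multiple matches) explicit, so the user interface can decide how to present each case.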
5.3 Use case 02
5.3.1 Use case name: Data Collection from CiteSeer
5.3.2 Use case reference number: UC 4.2
5.3.3 Actors: System
5.3.4 Author: Sharukh Malik
5.3.5 Goals and Brief description:
This use case begins when the user enters the name of a person to find that person's data, or gives the names of two persons to find the relationship between them. The system looks up the data in the database (CiteSeer) and displays the result of that query to the user. If any publications or research papers of the person are present in the database (CiteSeer), the system displays them.
5.3.6 Precondition
Application running: The Relationship Finder between Individuals application should be running.
MySQL connection: The connection to the MySQL database should be established.
Sharukh(BC083009) Fahad(BC083011) Muhammad Ali Jinnah University
5.3.7 Post conditions
After the system has performed the basic flow of this use case, the following states should hold:
Papers list: the list of papers published by that author.
Common publications by both persons.
5.3.8 Flow
Actor Action:
1. The user enters the name of one person or of two persons.
3. Search for that name in the CiteSeer author list.
5. Match found.
System Response:
2. Validate the person's name.
4. Search until a match is found.
6. Display the result.
5.3.9 Event Table
Trigger: Search
Source: CiteSeer
Destination: System
5.3.10 Alternative or Exceptions:
If no match is found, the system informs the user that no such author exists. If multiple results are found, the system asks the user to identify the correct author name.
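The two post conditions of this use case (a paper list per author, and the publications common to both persons) amount to building an author index and intersecting two sets. The sketch below assumes the bibliographic records are available as (title, authors) pairs; that record shape and the names `papers_by_author` and `common_publications` are hypothetical, chosen only for illustration in Python.

```python
from collections import defaultdict

def papers_by_author(records):
    """Build an index from author name to the set of paper titles they wrote.

    `records` is an iterable of (title, list_of_authors) pairs.
    """
    index = defaultdict(set)
    for title, authors in records:
        for author in authors:
            index[author].add(title)
    return index

def common_publications(index, person_a, person_b):
    """Return the sorted titles that both persons co-authored (empty if none)."""
    return sorted(index.get(person_a, set()) & index.get(person_b, set()))

if __name__ == "__main__":
    # Hypothetical sample records.
    records = [
        ("Paper A", ["X", "Y"]),
        ("Paper B", ["X"]),
        ("Paper C", ["Y", "X"]),
    ]
    index = papers_by_author(records)
    print(sorted(index["X"]))
    print(common_publications(index, "X", "Y"))
```

In practice the author names would first have to be disambiguated, since the same person may appear under different name forms in different records.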
5.4 Use case 03
5.4.1 Use case name: Data Collection from Web
5.4.2 Use case reference number: UC 4.3
5.4.3 Actors: System
5.4.4 Author: Sharukh Malik
5.4.5 Goals and Brief description:
This use case begins when the user enters the name of a person to find that person's data, or gives the names of two persons to find the relationship between them. The system looks up the data on the web and displays the result of that query to the user. If any publications or research papers of the person are present on the web, the system retrieves them and applies further processing to that data.
5.4.6 Precondition
Application running: The Relationship Finder between Individuals application should be running.
MySQL connection: The connection to the MySQL database and an internet connection should be established.
5.4.7 Post conditions
5.4.8 Flow
Actor Action:
1. The user enters the name of one person or of two persons.
3. Search for that name on the web.
5. Match found.
System Response:
2. Validate the person's name.
4. Search until a match is found.
6. Display the result.
5.4.9 Event Table
Trigger: Search
Source: Web
Destination: System
5.4.10 Alternative or Exceptions:
If no match is found, the system informs the user that no such author exists. If multiple results are found, the system asks the user to identify the correct author name.
5.5 Use case 04
5.5.1 Use case name: Text parsing
5.5.2 Use case reference number: UC 4.3
5.5.3 Actors: System
5.5.4 Author: Sharukh Malik
5.5.5 Goals and Brief description:
This use case begins when the data from the web has been collected. The purpose of text parsing is to filter that data: data that is of no use to the system is excluded, and the specific text that remains is extracted so that it can be used for finding the relationship between the given persons.
5.5.6 Precondition
Application running: The Relationship Finder between Individuals application should be running.
Web connection: The connection to the MySQL database and an internet connection should be established.
Data collected: Data about the given person should have been collected from the web.
5.5.7 Post conditions
The system will produce data that can be used for finding the relationship, for example common entities.
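One simple form of the filtering described in this use case is to strip markup, scripts, and styles from a collected home page and keep only the visible text. The sketch below does this with Python's standard `html.parser` module; it is an assumed first step rather than the project's parser, and the class name `VisibleTextExtractor` and function name `extract_text` are hypothetical.

```python
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []       # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and self._skip_depth == 0:
            self.chunks.append(text)

def extract_text(html):
    """Return the visible text of an HTML page as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

if __name__ == "__main__":
    page = "<html><body><h1>Jane Doe</h1><p>PhD, Example University</p></body></html>"
    print(extract_text(page))
```

The extracted text would then be passed to the heuristics and NLP steps that look for education, publication, and region information.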
5.6 Use case 05
5.6.1 Use case name: Find common entities
5.6.2 Use case reference number: UC 4.3
5.6.3 Actors: System
5.6.4 Author: Sharukh Malik
5.6.5 Goals and Brief description:
This use case begins when the data from the web has been collected. The purpose of Find common entities is to filter the data and find common features, such as education, publication, and geographical region, so that they can be used for finding the relationship between the given persons.
5.6.6 Precondition
Application running: The Relationship Finder between Individuals application should be running.
Web connection: The connection to the MySQL database and an internet connection should be established.
Data collected: Data about the given persons should have been collected from the web and from the given databases.
5.6.7 Post conditions
The system will produce data that can be used for finding the relationship, for example common entities.
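Assuming each person's structured profile is a mapping from a feature name (education, publication, region) to a list of values, finding common entities reduces to a per-feature set intersection. The profile shape, the case-insensitive comparison, and the name `common_entities` are assumptions made for this Python sketch.

```python
def common_entities(profile_a, profile_b):
    """Return, per shared feature, the values that both profiles have in common.

    Values are compared case-insensitively; features with no overlap are omitted.
    """
    shared = {}
    for feature in profile_a.keys() & profile_b.keys():
        values_a = {value.lower() for value in profile_a[feature]}
        values_b = {value.lower() for value in profile_b[feature]}
        overlap = values_a & values_b
        if overlap:
            shared[feature] = sorted(overlap)
    return shared

if __name__ == "__main__":
    # Hypothetical profiles.
    person_a = {
        "education": ["Example University", "Sample College"],
        "region": ["Islamabad"],
        "publication": ["Paper A"],
    }
    person_b = {
        "education": ["example university"],
        "region": ["Lahore"],
        "publication": ["Paper A", "Paper B"],
    }
    print(common_entities(person_a, person_b))
```

Exact string matching is a deliberate simplification; real profile values (institution names in particular) would likely need normalization before comparison.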
5.7 Use case 06
6. Functional and Nonfunctional Requirements
6.1 Functional Requirements
Ref#    Functions    Category
Ref.1   Data of the person should be present in the database or on the web.
Ref.2   Establish the connection with the available database.
Ref.3   The system will collect the data from DBLP.
Ref.4   The system will collect the data from CiteSeer.
Ref.5   The system will collect the data from the Web.
Ref.6   The system will disambiguate the author.
Ref.7   The system will show the list of publications of the person to the user.
Ref.8   The system will acquire the web page.
Ref.9   The system will do text parsing and take the useful data from the web site.
Ref.10  The system will find the common entities between two persons, such as education, publication, or geographical region.
Ref.11  The system will categorize the relation between two persons within 3 or 4 defined levels.
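Ref.11 leaves open how a relation is assigned to a level. One possible approach, sketched here in Python, is to weight each kind of evidence, sum the weights, and map the total to a level with fixed thresholds. The evidence kinds, the weights, the thresholds, and the level names below are all assumptions chosen for illustration; the report does not define them.

```python
# Assumed weight per piece of evidence of each kind (illustrative values only).
WEIGHTS = {"co_authorship": 3, "social_network": 2, "common_entity": 1}

# Assumed thresholds, checked from the strongest level down.
LEVELS = [(6, "high"), (3, "medium"), (1, "low")]

def relationship_level(evidence_counts):
    """Map counts of evidence per kind to a relationship level.

    `evidence_counts` maps an evidence kind to how many times it was observed;
    unknown kinds contribute nothing. Returns "none" when no evidence counts.
    """
    score = sum(WEIGHTS.get(kind, 0) * count for kind, count in evidence_counts.items())
    for threshold, label in LEVELS:
        if score >= threshold:
            return label
    return "none"

if __name__ == "__main__":
    print(relationship_level({"co_authorship": 2}))
    print(relationship_level({"social_network": 1, "common_entity": 1}))
```

Keeping the weights and thresholds in tables makes it easy to move between a three-level and a four-level scheme once the project fixes the categories.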
6.2 Nonfunctional Requirements
Ref# 3.1   The system will show the data collected from DBLP to the user.
Ref# 4.1   The system will show the data collected from CiteSeer to the user.
Ref# 5.1   The system can ask the user to confirm the author during disambiguation.
Ref# 10.1  The system can ask the user to confirm entities such as education, publication, or geographical region.