
Dynamic Tracking & Visualisation of Social Networks
Creative and interactive visualisations applied to peer-to-peer chat

Final Project Report

National University of Ireland, Galway
Department of Information Technology

Student: Sean Bryceland (04325427)
Academic Advisor: Dr. Hugh Melvin
Cisco Advisors: Fergus Deffely and Shane Dempsey
Visualising Social Networks | Final Report
ACKNOWLEDGEMENTS
I would like to offer my thanks to Dr. Hugh Melvin for his continued advice, encouragement and guidance throughout the duration of this project, without which I would have delivered neither successfully nor on schedule. It was greatly appreciated.
I also offer my gratitude to both Fergus Deffely and Shane Dempsey of Cisco Systems, Galway. Their invaluable knowledge and expertise in the field of corporate/social communication, combined with their aspirations to see creative and adaptive representations applied to this environment, was a great motivation.
To all my friends I wish to offer my never-ending appreciation. You have kept me sane, and inspired me.
Finally, a special word of thanks goes to my family. You have never stopped being a much-needed ray of light, supporting me throughout the course of this project and my last four years of University.


CONTENTS
Acknowledgements
List of Figures
List of Tables
1. Executive Summary
2. Introduction
3. Literature Review & Background Detail
4. Design
5. Technologies
6. Implementation
7. Results
8. Conclusion
References
Appendix A
Appendix B


LIST OF FIGURES
Figure 1: User relationship bridging
Figure 2: Users' contacts and their tag groups
Figure 3: Simple overview of the system & its components
Figure 4: Early mock-up of chat client
Figure 5: XMPP Server intercepting messages
Figure 6: Database Design
Figure 7: Graphic Demonstration of Visualisation
Figure 8: Frequencies before Processing
Figure 9: Frequencies after normalisation
Figure 10: Frequencies after smoothing
Figure 11: Final weights as derived from transformation
Figure 12: XML Envelope
Figure 13: Login Screen
Figure 14: Contacts Screen
Figure 15: Conversation Window
Figure 16: An overview of the web server environment
Figure 17: Using Trigonometry to determine centre points
Figure 18: The flow of movement
Figure 19: Users and their contacts
Figure 20: Tag Cloud for the term 'Project'
Figure 21: Tag Cloud for the term 'demo'
Figure 22: Flash visualisation for the term 'Project'
Figure 23: Flash visualisation for the term 'demo'
Figure 24: Visualisation of both a fixed user and a tag
Figure 25: Contact map


LIST OF TABLES
Table 1: Demonstration of Visualisation Process
Table 2: Frequencies and weights for users using the tag 'Project'
Table 3: Frequencies and weights for users using the tag 'demo'

1. EXECUTIVE SUMMARY
This project sets out to examine methods of dynamically tracking, storing and visualising data across corporate social networks, with specific regard to Instant Messaging and Presence networks.
By applying creative and interactive visualisations to stored data, it is hoped that increased value can be derived from the benefits already provided by such a network. Increased value, in this case, is the identification and utilisation of relationships and knowledge repositories that exist within an organisation.
The project will examine social networking and its context within a business environment, specifically Cisco and their Unified Presence (CUP) system. CUP is an information collection layer that runs atop Cisco's communication services, combining data from many sources (see Cisco Systems, Inc., 2007), including IM, desktop phones, mobiles and calendars; this information is then collated to calculate a user's presence, for example "in a meeting". Cisco realised the extent to which they could capture information on an organisational level, leading to the conception of this project to look at ways to derive further value from that information via visualisation. In particular, Cisco wished to explore relationships between users of an organisation's communication network, much like the ENRON Explorer (Trampoline Systems, 2005).
Working with Cisco and my NUI, Galway project supervisor, I devised a Project Definition that would explore the possible methods of data collection and visualisation. The project would focus on allowing users of a corporate XMPP (Extensible Messaging and Presence Protocol) instant messaging network to converse with their peers while at the same time tagging their conversations with relevant keywords. Tagging these conversations would create weighted associations between users relating to certain topics. Having established this web of relationships, the project would then focus on providing ways to visualise such information in a way that was not only intuitive but beneficial to users.
Identifying such specific relationships between users creates an invaluable resource that can be utilised by an organisation and its members to great effect. The original basis when developing the project's definition was that such a system would be useful in identifying and bridging communication gaps. For example, as seen in Figure 1: Sam talks to Joe about Java; Joe talks to Brian about Java; Joe is on sick leave. Using the system, Sam would be able to identify that Joe also talks to Brian about Java, and can in turn get in touch with Brian herself.

Figure 1: User relationship bridging.
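The bridging scenario of Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the project's implementation: the `conversations` list, the helper names `build_links` and `who_else_discusses`, and the use of a simple conversation count as the weight of an association are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tagged conversations: (user_a, user_b, tag).
conversations = [
    ("Sam", "Joe", "java"),
    ("Joe", "Brian", "java"),
    ("Joe", "Brian", "java"),
    ("Joe", "Alice", "budget"),
]


def build_links(records):
    """Count tagged conversations per (user, tag, peer), in both directions."""
    links = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for a, b, tag in records:
        links[a][tag][b] += 1
        links[b][tag][a] += 1
    return links


def who_else_discusses(links, user, contact, tag):
    """People the contact talks to about `tag`, excluding `user`, busiest first."""
    peers = links[contact][tag]
    candidates = {p: n for p, n in peers.items() if p != user}
    return sorted(candidates, key=lambda p: (-candidates[p], p))


links = build_links(conversations)
print(who_else_discusses(links, "Sam", "Joe", "java"))  # ['Brian']
```

With Joe unavailable, Sam's query returns Brian, which is the bridge the example describes.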
However, as explored later in this document, other benefits are derived from the existence and visualisation of such a relational network, specifically the creation of an organisation-wide functional matrix identifying knowledge-rich groups within an organisation. This methodology allows full advantage to be taken of the complete knowledge economy within that organisation, simply by identifying it.
The visualisation system would also be easy to adapt so that other non-contextual relations, such as location or role, could be displayed.

2. INTRODUCTION
INSTANT MESSAGING & VISUALISATION
Instant Messaging (IM) has become a mainstay of internet usage over the past 5 years, with figures from 2005 showing that 70% of all internet users use IM, and that 26% of all IM users use IM at work (America Online, Inc, 2005). Consequently, as IM becomes more of a mainstream option within the corporate environment, so too grow the possibilities for information collection and usage within that environment.
In particular, it is possible to collect a large mass of relational information from users on the network simply by identifying who they are connected with. Indeed, a vast amount of research has been done on the links between people within social networks. This information in itself is extremely valuable in identifying strong links and paths between users, but what of the nature of these links? If these connections were scrutinised at more than face value, drilled down to a level of topic and context, a network of users could be seen and examined in many new lights. The resulting information would allow users and networks of people to be identified on the basis of knowledge and intelligence as opposed to formal linear structures.
This project aims to take the concept of linkage in a social network, specifically an IM network, and apply the contextual layer to both develop and visualise the knowledge economy within a network.
KNOWLEDGE ECONOMY
A common expression is that we live in a knowledge economy; this phrase is used in reference to countries, and indeed business. However, is it used correctly with reference to organisations and their culture?
At present, networking between people within an organisation generally does not occur at all levels, or does not occur within the organisation at all; networking is an activity usually reserved for higher-level management and directed externally. This obviously is not an issue for small businesses; however, it does impose limiting factors on medium and large businesses. For example:
A developer on a team in Galway might be working on a project, and given the task of developing a plug-in in a fairly new language. Fortunately, he had been put in touch with an employee in the Chicago offices who had been doing a lot of personal research on, and had great interest in, that language. However, the Chicago employee had taken ill and went on sick leave while the Galway employee was struggling to make any inroads with the language. Because the Galway employee was unable to identify any other help within the organisation, deadlines were missed. It is likely that the Chicago employee had talked about this subject before with many other employees in the firm in similar situations; however, the Galway employee had no way of linking into that network.
Currently, organisations tend to keep limited required information on whom they employ, what they specialise in, and their position within the overall structure, but generally do not keep a record of any superfluous data such as interests or pastimes, be they related to that organisation's enterprise or not. If, in the above example, the organisation had access to such data, the Galway developer would have been able to identify others within the organisation to assist him and achieve the deadlines set.
This project looks at a means of removing potential barriers caused by an organisation's structure and determining a non-hierarchical map of employees based on knowledge of topics, be they industry-related or more social topics. This will then allow the exploration of an organisation as an intelligent entity, no longer relying on job descriptions to identify relevant employees for a topic, but calling up all employees who have shown any interest in or experience with it. As will be seen later, such data mining can potentially bring many benefits to an organisation.
EXTRACTING INFORMATION FROM AN IM NETWORK
This system will derive the context and knowledge behind links and conversations within the IM network via tagging. Tagging, a concept heralded into popularity with the Web 2.0 movement, is a natural user-orientated progression from older system-orientated metadata such as keywords. In the confines of this project, the user would start a conversation with another user and then tag that conversation based on its content. The other user then agrees to or changes the tags chosen, and information on the link between the two users, specifically that conversation, is committed to the visualisation system's database by the chat server.
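The propose, agree-or-change, commit sequence described above can be sketched as follows. This is an illustrative sketch only: the `ConversationRecord` type, the tag normalisation rule and the assumption that an amended tag list simply replaces the proposal are all hypothetical choices, not details taken from the implemented system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConversationRecord:
    """What the chat server would commit once both users accept the tags."""
    ended_at: datetime
    participants: tuple
    tags: frozenset


def normalise(tags):
    """Lower-case and trim tags so 'Java ' and 'java' count as one tag."""
    return frozenset(t.strip().lower() for t in tags if t.strip())


def agree_tags(proposed, response=None):
    """Return the agreed tag set.

    `response` is None when the second user accepts the proposal unchanged;
    otherwise it is the amended tag list, which replaces the proposal.
    """
    return normalise(proposed if response is None else response)


def close_conversation(initiator, peer, proposed, response=None, when=None):
    """Build the record for a finished conversation with its agreed tags."""
    when = when or datetime.now(timezone.utc)
    return ConversationRecord(when, (initiator, peer), agree_tags(proposed, response))


record = close_conversation("sam", "joe", ["Java", "plugins"], response=["java", "eclipse"])
print(sorted(record.tags))  # ['eclipse', 'java']
```

Keeping the agreed tags as a set means the same record applies equally to both participants, which matters because a tag on a conversation describes both users.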
Tagging is of course a voluntary activity performed by the user, but provides a certain level of privacy and interpretation that automatic analysis cannot.
By applying this process to all conversations across an entire corporate IM network (see Appendix B), a vast amount of information on links between people based on topic will rapidly be developed. Once stored, the information will allow for easy filtering and in turn visualisation, to harness the data's full potential.
VISUALISING INTELLIGENCE-BASED RELATIONSHIPS
The constantly changing nature of an IM network, on top of the level of relational data being stored and derived by the system, presents a number of display problems. A user is connected to many other users directly, while it can also be viewed that the user connects to many other users via many topics (see Figure 2). This called for a visualisation of the data that was not only innovative but dynamic and interactive. The visualisation would be dynamic in that it would be generated each time it was shown, based on the most current data from the IM network, and interactive in allowing the user to drill down into the visualisation where needed, allowing easy filtering and traversal of data.
By developing both the tag collection system and the visualisations concurrently, this project had great scope and room for experimentation.
STRUCTURE OF FYP
As outlined in this introduction, there is a sound basis for first of all collecting concise relational data from an IM network, and indeed a need to visualise such data. This report therefore consists of two main sections, Rationale and Development.
The Executive Summary and this introduction begin to explain the rationale for this project, which is discussed further in chapter 3, a literature review of the elements of this system. It discusses the need for and benefits of developing and exploring networks within an organisation, adapting those theories to the instant messaging model, instant messaging protocols, and of course visualisation techniques.
Chapter 4 introduces the design of the project, a high-level, objective-based dissection of the project, its contributing systems, its theory and ethical implications. Chapter 5 describes the technologies chosen for the system, and why they were chosen where alternatives existed. The system's implementation is outlined in chapter 6, with detail of problems that arose and their solutions. Chapters 7 and 8 review the results of the project and draw conclusions.

Figure 2: Users' contacts and their tag groups
3. LITERATURE REVIEW & BACKGROUND DETAIL
As mentioned in chapter two, networking is an essential aspect of any organisation's culture, be it networking with outside forces or, as is less common but the more valuable of the two, internal networking. The aim of this project is to take advantage of existing "primitive" internal networks, in particular IM networks, and from those derive and display rich and useful information that will increase the value of those networks, turning them into intelligent referential maps.
This chapter aims to outline the literary backing for the need for such knowledge collection and exploration, while also outlining related work in the field of visualisation.
THE PEOPLE NETWORK
The needs of people within an organisation have been studied for many decades, with Maslow's hierarchy of needs (A Theory of Human Motivation, Psychological Review, 1943) discussing the needs and wants of workers, and Elton Mayo's exploration of interaction between workers in an organisation in direct relation to productivity (Hawthorne and the Western Electric Company, The Social Problems of an Industrial Civilisation, 1949). Both these studies played huge roles in changing the way organisations interact with and treat employees. Today we live in a world where most organisations fully realise that it is their people that power them and make them a success. With this in mind, there is a drive to develop these people and realise their full potential, primarily by creating environments that will foster employee development. Building on these theories, this project looks at ways of further realising the true potential of an organisation's people by analysing and making use of information collected through communication methods already in use, to determine the collective intelligence of, and further develop linkage within, that network.
Gilchrist (2004) stated that "Networks are presented as an effective mode of organising in complex, turbulent environments and play an important role in the development of successful coalitions and partnerships"; this view fits what one would expect of high-level management reaching out to forces external to their organisation, while at the same time the theory can also be successfully applied on an internal basis, between employees. As mentioned earlier, the necessity and value of such networking between employees is not as great in smaller businesses as it is in larger organisations, especially those that consist of different staff groups and teams. Networking on an internal basis not only allows a free flow of concepts and ideas between different locations and groupings of people, but is also completely detached from the formal barriers and hierarchies that may exist within an organisation. "This networking approach to empowerment is developed using a 'circuits of power' model that emphasises the value of boundary-spanning work in promoting cohesion and managing diversity" (Gilchrist, 2004), showing that through investing in and developing a network of people as well as a hierarchy, an organisation can reap the rewards of an engaged, empowered workforce.
This project is based upon already existent lateral communication networks, which may only slightly diverge from the layout of a hierarchy; for instance, a developer may only be in regular contact with members of his team and direct management. Such a scenario creates problems, in that an organisation may contain hundreds of employees, but each of these employees has only a handful of people they either know or can turn to, so the resources of information and experience that exist within the organisation go largely unused. This identifies the need to take these rigid structures and apply to them the many layers of relational information that already exist within and traverse the organisational structures, in the hope of further developing and taking advantage of these relational networks. By first identifying the full extent of the spread of knowledge and experience across a network, we can then mine the results to find information that will be beneficial to other employees. This presents the problem of how to make use of such information, turning masses of data into value for both an organisation and its employees.
META NETWORKING, KNOWLEDGE SYSTEMS & PEOPLE ROUTING
The need for good networking across an organisation and its employees has been identified, but how do we take already existing systems and develop them into tools that support the continued development and value creation of internal people networks?
Gilchrist (2004) described such networking of people as meta-networking, a methodology "about supporting and shaping the web of connections that weaves across the communities and weaves them into the wider world. Meta-networking involves the usual skills and processes of networking such as making contact, finding connections, crossing boundaries, building relationships and interpersonal communication." With the development of this project, meta-networking will be supported by actual tools, as opposed to remaining in the realms of theory, allowing users not only to identify the subgroups to which they belong, but to expand them further. By examining relationships between users A and B, we can find a number of people related to B that do not know A, and likewise people related to A that do not know B. By bridging this relational gap and introducing these people, users can further develop their own networks. In practice such concepts have been applied simply on the basis of a user's connections; with this project, however, identification of users in an extended network is based on context, ensuring users are able to develop their networks by adding like-minded people, bringing not only value and worth but rationale to the process.
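The A and B comparison amounts to a set difference over contact lists, optionally filtered by topic. The following sketch illustrates the idea; the `contacts` structure, the `suggestions` helper and the sample tags are assumptions made for the example.

```python
# Hypothetical tag-labelled contact lists: user -> {contact: set of shared tags}.
contacts = {
    "A": {"B": {"java"}, "C": {"testing"}},
    "B": {"A": {"java"}, "D": {"java"}, "E": {"golf"}},
}


def suggestions(contacts, user, via, tag=None):
    """Contacts of `via` whom `user` does not yet know.

    With `tag` set, only contacts that `via` talks to about that tag are
    returned, which is the context-based filtering described above.
    """
    known = set(contacts.get(user, {})) | {user}
    found = set()
    for person, tags in contacts.get(via, {}).items():
        if person not in known and (tag is None or tag in tags):
            found.add(person)
    return sorted(found)


print(suggestions(contacts, "A", "B"))          # ['D', 'E']
print(suggestions(contacts, "A", "B", "java"))  # ['D']
print(suggestions(contacts, "B", "A"))          # ['C']
```

Without the tag filter A is offered both D and E; with it, only D, the contact who shares the topic that links A and B.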
By examining the information collected from the lateral IM network, it is hoped that we can display this information to users appropriately, to aid them in developing their meta-networks. This will be done through the users of the system identifying the links between themselves and their contacts, information volunteered for the benefit of developing their own networks and those of others, with the main concept of meta-networking being "about the work involved in supporting and transforming other people's networks" (Gilchrist, 2004). This is discussed in further detail later.
Mapping of an organisation occurs mainly through examining its functional matrix, a fixed structure that consists of predefined groups and the people within them. Through identifying the knowledge-based relationships between people, it is possible to map an organisation in a dynamic, fluid form, where groups are now knowledge systems and each employee can be a part of many of these systems, simply through expressing an interest or being an expert. To examine a mass of people in such a way is essentially to look upon an organisation as one large operating unit with many combined intelligences, while still remembering that these intelligences are made up of the individuals themselves and not created through the efforts of the organisation. Primarily this presents itself as a huge value to the users of the network in identifying others of like mind to discuss with, debate with and mentor, while at the same time identifying to the organisation as a whole many resources that had previously gone unidentified. By examining individuals not only as elements of the whole system and its predefined subsystems, we can apply a new layer of subsystems that are both organically created and maintained by the individuals, creating a certain level of adhocracy. With this approach an organisation can begin to look at topics of intelligence as elements of a system, deriving value by identifying people and groups who collectively possess in-depth knowledge on a subject. The employees, the users of the system, collectively contribute to these organic subsystems, bringing their own views and concepts to other like-minded members. This entire concept of the knowledge system allows employees to diversify from the fixed structures they exist in full-time to these flexible organic systems part-time. Harrington (1991) also explored this concept of selective membership: "the individual is thus an element within an organizational [sic] system, part of its many subsystems, feeding into and extracting out of them through the appropriate networks."
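A knowledge system in this sense is simply the set of people associated with a topic, and one person may belong to several such sets. The sketch below shows one way this overlapping membership could be derived from tag usage; the `tag_usage` data and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (user, tag) pairs taken from tagged conversations.
tag_usage = [
    ("sam", "java"), ("joe", "java"), ("brian", "java"),
    ("joe", "golf"), ("alice", "golf"), ("sam", "testing"),
]


def knowledge_systems(usage):
    """Map each tag to the set of users who have used it."""
    systems = defaultdict(set)
    for user, tag in usage:
        systems[tag].add(user)
    return dict(systems)


def memberships(systems, user):
    """All knowledge systems (tags) a given user belongs to."""
    return sorted(tag for tag, members in systems.items() if user in members)


systems = knowledge_systems(tag_usage)
print(sorted(systems["java"]))      # ['brian', 'joe', 'sam']
print(memberships(systems, "joe"))  # ['golf', 'java']
```

Because membership is derived from the data rather than assigned, a user joins or leaves a system simply by starting or ceasing to tag conversations with that topic.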
To the employee already existing within an organisation this presents an invaluable raison d'être. Voluntarily taking part in and contributing to knowledge systems allows employees to further satisfy their cognitive needs, as outlined by Maslow (1943).
To the organisation, such knowledge systems present huge quantities of knowledge that have previously lain unused; however, with the identification and development of such systems an organisation must realise that these are wholly uncontrollable organic networks. This has a number of beneficial and problematic implications. As organic collections of like-minded people, the relational networks become self-managed as people join and leave at their own pace. This means that management will not have to encourage or discourage membership, and at the same time can identify employees who have knowledge and experience on subjects that might have been thought lost with staff turnover. This concept of robust networks was explored by Harrington (1991), "Individuals per se are merely elements within the structure. They come and go, bring in different qualities and abilities but the structure on the whole, with a few exceptions continue unchanged", and Gilchrist (2004), "They (Networks) are more flexible, less hierarchical and therefore more responsive to unexpected shifts in the environment." This example alone identifies one of the most beneficial reasons for such systems. Also presenting itself through the sheer dynamism of these knowledge systems is an organisation-wide zeitgeist: by examining the data, an organisation and its members can identify where the overall mindset of employees lies and what topics and issues are popular and important to the overall community, while also examining the phenomenon of information as an organic force. The downsides of such a system have already been hinted at in the benefits above. As the system is organic and built on the concept that users define their relationships with others, the organisation has no control over who can join and leave knowledge systems. Examining the system under the auspices of a zeitgeist may also highlight issues and concepts contrary to that organisation's principles or stability. As noted by Harrington (1991), "if organisations comprise solely individuals then an element of unpredictability is brought into the equation simply because individuals are individuals."
Above, the benefits of and need for such informal relational networks have been identified, but how would a user traverse and explore this network while expanding their own?
Herbsleb et al. (2002) identified that users of communication networks simply would not find and contact other users directly without some reason or association. By harvesting information on already existent relations between system users, we can present to all users a dynamic map of who they are connected with and, in turn, their contacts' connections based on topic of knowledge. By providing a mechanism to browse the vast network of people through association, be it by a person or topic, the fear of and barriers to traversing organisational boundaries are diminished. The benefit is that "networking with other community workers or other like-minded people creates opportunities for informal support, supervision, advice and mentoring" (Gilchrist, 2004). By identifying other like-minded people through connections that are more socially identifiable than those of an organisation chart, we see a form of people routing. In most organisations an employee's contacts consist of his team (horizontal) and direct managers (vertical), whereas with this system it is hoped they will have their own network of associates for a number of different topics, vastly expanding the free flow of information and experience from across the organisation into that small physical team unit, with such networks operating effectively "in complex situations that are characterised by uncertainty, interdependence and opportunities for informal interaction across organisational borders" (Gilchrist, 2004). But if an appropriate associate does not already exist in a person's network, or is missing from it, this system will allow them to dynamically route to the people they need to find, be it through a specific contact they already know or straight into an existing knowledge system. This concept of finding the shortest and best route to information is much like that of IP networking systems, where there are many possible routes available within the system. This project attempts to aid the user in finding their best route.
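One simple reading of people routing is a shortest-path search over the contact graph. The sketch below uses breadth-first search and treats every link as equal; the `graph`, the `java_system` membership set and the `route_to_topic` helper are assumptions for illustration, and a fuller version might prefer links with higher tag weights.

```python
from collections import deque

# Hypothetical contact graph and knowledge system membership.
graph = {
    "sam": ["joe", "alice"],
    "joe": ["sam", "brian"],
    "alice": ["sam", "carol"],
    "brian": ["joe", "dave"],
    "carol": ["alice"],
    "dave": ["brian"],
}
java_system = {"brian", "dave"}


def route_to_topic(graph, start, members):
    """Shortest chain of introductions from `start` to any member of a topic.

    Breadth-first search visits people in order of distance, so the first
    member reached is one of the closest. Returns None if nobody is reachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person in members and person != start:
            return path
        for contact in graph.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(path + [contact])
    return None


print(route_to_topic(graph, "sam", java_system))  # ['sam', 'joe', 'brian']
```

The returned path is the chain of introductions: Sam reaches Brian, a member of the Java knowledge system, through Joe.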
APPLYING THE INSTANT MESSAGING MODEL
The challenge of this project was to take the concepts of relation identification outlined previously and apply them to a suitable Instant Messaging paradigm. Such a knowledge/relationship identification system lends itself to IM networks, with instant messaging being seen as an informal method of communication and, as expressed by Herbsleb et al. (2002), as having the "water cooler" effect on top of instant communication. Due to the flexibility and informality of IM networks, alongside the need for users to expand beyond direct contacts, such a communication platform becomes a perfect match for the development of unstructured knowledge systems. The initial scenario for this was outlined by the Cisco lead, whereby a member of an organisation was assigned an instant messaging account through which they could add contacts and communicate with others. The project then had to identify the best way for a member to identify the nature of their relationships and conversations with others. It was agreed that this would be done by tagging conversations.
Tagging is a method of categorisation, such as defining keywords, and was a term made popular by the blogosphere and the development of the Web 2.0 generation. Technorati (2008) describe tags as "a simple category name. People can categorize their posts, photos and videos with any tag that makes sense." This project will apply this concept to conversations, whereby a user will start a conversation with another and add relevant tags to that conversation. The users would then agree on these tags, and at the end of that conversation its time, participants and tags would be committed to the data store holding all the relational information of the network.
Thismethodofcollectinginformationraisesanumberofissues,suchaswhichIMtechnologycanbeadapted
to deal with such a process, privacy issues, how can user participation in such a voluntary process be
guaranteed.
The Cisco leads indicated that the technology used should be the Extensible Messaging and Presence Protocol
(XMPP), described by its overseeing body, Jabber (2004), as providing easy, rapid coverage for all existing
customer IM requirements, plus unlimited future growth opportunities for more sophisticated applications,
such as voice and video conferencing, real time customer service portals, and more. Being based upon open
standards, XMPP lends itself to easy adaptation and deployment, allowing not only integration with many other
systems but also the addition of custom functionality on top of core functions. By developing on such a flexible
platform the project would be able to build the extra requirements of a tag gathering system into an
already existing IM network.
The issue of privacy is consistently raised when dealing with information communication technology in the
workplace, and in many cases is decided on a case by case basis depending on each organisation's IT
policies. IM networks already exist and are used by many organisations; however, this is usually on
the basis that IM is a generally unmonitored means of direct and fast communication. Once users are asked to
start defining what their conversations are about, for public dissemination, there may be a reluctance to
participate, or to report on all conversations. Muller et al. (2003) identified that
up to 25% of users of corporate IM networks used them for socialising with others, something those users may
be reluctant to flag within the organisation. It is the hope of this project that social communication within an
IM network is tagged by the user as well as work related material, to help in creating rounded and balanced
knowledge systems. Another privacy issue lies in how the tags for a conversation apply to both users, so
both users must agree on a set of tags for a conversation. This led to developing a method of social
categorisation, whereby conversation participants must choose tags together, ensuring both are happy with
how the relation has been defined; this process is outlined in more detail in chapter 4.
So why participate at all? As in any job, employees find themselves adapting to and adopting habits, such as
organising emails into folders. Such a practice may now be seen as the norm, but this was not always the
case. It is hoped that the novel features of this system will lead to it being adopted rapidly by all users,
consequently becoming a norm within an environment. These novel features are the ability to
socially categorise a conversation together with other users, and then use this information to develop
their own networks and those of others, as described earlier in this chapter, through the visualisation system.
VISUALISING DATA
"Information visualisation as a distinctive field of research has less than ten years of history, but has rapidly
become a far reaching interdisciplinary research field" (Chen, 1999). Almost a decade has passed since Chen
made this observation, but the sentiment still holds true, with hundreds of books, articles and
studies on the subject. What is striking about the topic of visualisation is that while there are a number of set
methods adopted for displaying certain kinds of information, visualisation as a whole is about exploring and
experimenting with new methods of representing data in clear and informative ways.
Visualisation presents vast quantities of often unintelligible data to the human eye in a more graphical and
therefore more meaningful form. In direct relation to this project, relationships in organisations are often visualised
in organisation charts, whereas there is no real way to visualise the huge scope of the knowledge systems that
sit on top of these networks. As seen in figure 2, a small section of a knowledge system can be represented in a
Venn diagram. However, this does not scale well, in that all of the system's data cannot be clearly displayed in
one diagram. This calls for an interactive solution, allowing the browsing of O(tn²) possible interrelated
views, n being the number of users and t the number of possible topics. This amount of data did not easily lend
itself to Moreno's (1932) proposed display methodology, whereby all users were represented as nodes and
their locations over a set area were defined spatially.
Chen's Information Visualisation and Virtual Environments (1999) goes into detail on the three models for
linking members of online communities: spatial, semantic and social navigation. The spatial model
is explained by Freeman (2005) in relation to social networking; he stated, "Network analysts collect and
examine data on actor to actor ties. Such data record who is connected to whom and/or how closely they are
connected." However, Chen (1999) also notes that the use of spatial models in attempts to support collaborative
virtual environments has been criticised as oversimplifying the issue of structuring, or framing, interactive
behaviour. This criticism rings true for this project's model of relational links between users,
where we do not examine the overall strength or frequency of communication with a contact but rather that of a
contact within the bounds of a topic or concept. Based on this principle, visualisations in this project will focus
on the semantic (knowledge relations & their weight) and social navigation (direct contacts) models.
Faust (2005) took a more adaptable approach, stating that affiliation networks are two mode networks.... The
affiliation relation links collections of entities: actors belong to multiple events, and events may include
multiple actors. Thus affiliation networks are non-dyadic, with in this case the actors being users, and events
being a topic of knowledge. In theory each link within our system is triadic, containing 3 arguments: two
being the users involved, and the third being the relation's topic, defined by a tag.
Taking the issues of visualising such a complex system into account, this project aimed to provide a
visualisation system that would allow the drilling down of data to display appropriate graphics, through the
process of specifying 2 of the 3 arguments and representing the possible values of the remaining argument.
The resulting visualisations would stem from Moreno's (1932) method of displaying users as nodes but,
instead of using the location of the nodes as an indication of the strength and importance of a link, they would
adapt the scale of the nodes to show this strength and importance. By doing so it is hoped the users of the
system will browse it based on the weight of knowledge and context behind each link, as opposed to
looking at the visualisation as a map of connected users.
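The drill-down idea of fixing 2 of the 3 arguments of a link can be illustrated with a minimal Python sketch. It is not the project's implementation: the `Link` type, the function names and the sample data are all hypothetical, and the count returned for each result is only a stand-in for the weighting described in chapter 4.

```python
from collections import Counter
from typing import NamedTuple


class Link(NamedTuple):
    """One triadic link: two users plus the tag describing their conversation."""
    user_a: str
    user_b: str
    tag: str


def tags_of(links, user):
    """Fix one user: count the tags that user has applied with anyone."""
    counts = Counter()
    for link in links:
        if user in (link.user_a, link.user_b):
            counts[link.tag] += 1
    return counts


def partners(links, user, tag):
    """Fix a user and a tag: count the contacts on the other end of each link."""
    counts = Counter()
    for link in links:
        if link.tag != tag:
            continue
        if link.user_a == user:
            counts[link.user_b] += 1
        elif link.user_b == user:
            counts[link.user_a] += 1
    return counts


# Hypothetical sample data, not taken from the project's database.
links = [
    Link("bob", "sam", "java"),
    Link("mark", "bob", "java"),
    Link("bob", "sam", "xml"),
    Link("sam", "james", "project"),
    Link("bob", "sam", "java"),
]
```

With this data, `partners(links, "bob", "java")` counts Sam twice and Mark once, which is the kind of relative size a node-scaling visualisation could use in place of spatial position.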


4. DESIGN
This chapter focuses on the design of the system, specifically what the project set out to achieve and how
it would do so. As this project was developed in three main sections (the XMPP chat network, the
visualisation system, and the database that connected them, as seen in figure 3), I will split this chapter
accordingly.

Figure 3: Simple overview of the system & its components
The overall system goal was to provide a solution allowing users to tag IM conversations, store this data and
visualise it in appropriate and innovative ways. Added to this was the need to visualise this
wealth of information in a dynamic and fresh form, which required the most up to date information to be
available at all times for immediate processing by the visualisation system.
The overall system called for a large amount of research into the required components, with specific regard to
planning how each of these would link in with the others. Development of the visualisation system
depended greatly on what was being stored within the database, while development of the database
depended greatly on what was available from the XMPP system. Consequently the development of the XMPP
and database systems overlapped slightly.

XMPP CHAT NETWORK
The goal of the XMPP chat network was to build on the already accepted common functionality of instant
messaging systems, while adding the ability for users to tag conversations in the IM client, and for the server
to in turn process these tags appropriately.
An XMPP chat client works on the principle that it makes a connection to an XMPP server, through which it
sends and receives all messages. The server handles the process of delivering messages to recipients.
THE CHAT CLIENT
There are many instant messaging client applications available today that operate on XMPP networks, as it is
an open protocol; however, this project called for the development of its own client, as no existing options
were available that allowed specifically for the input of tags within a conversation.
Concept
This IM client would need to act like any other client, consisting of a login screen, a main screen showing
contacts and their statuses, and a conversation window. Uniquely, the conversation window would need to have an
input area whereby a user can tag that conversation. Within the confines of this project it was decided to make this a
text based input underneath the conversation input line (see figure 4). The IM client's main users will be end users
within an organisation, covering all employees with access to the IM network. This is a huge scope, and as such
the client, and in particular its tagging functionality, must be scalable, easy to use, and non-confrontational for
users of varying experience levels.
Tagging from the Client Perspective
Initially, when a user opens a new conversation with a contact, the client sends a message, {newtagset}, to
the server, identifying that the ensuing conversation will be taggable and should be dealt with as such.
A prominent issue in devising this system was how the participants of a conversation would agree on a valid
set of tags: should they be agreed upon at the start or end of the conversation, or throughout its duration;
should they be set by the initiating user, the recipient, or both? After conceptualising many scenarios, the best
method for tagging conversations appeared to be letting both users have access to a shared text box that
would update in the other user's window when changed. This would take advantage of the XMPP
protocol and give both users the ability to define the tags for a conversation, while keeping tagging separate from
the main chat process. Should one user have an issue with the tags chosen by the other, they can
change those tags or discuss them with the other user via the conversation window. The tags are only finalised within
the database once that particular conversation is terminated by either of the users, ensuring both have agreed
on the set. This concept of a shared text box brings yet another novel social approach to the IM application,
making it a very usable feature. As tagging is not given prominence in this system, the main
functionality being chat, it was prudent to also ensure that a user is made aware when the other participant has
added or changed tags; this would be achieved by a short note appearing in the main conversation window.
This shared text box would be operated using the XMPP protocol, sending custom XMPP message packets. This
method was chosen as such packets are easy to filter on both the server and the recipient's client, and would also
degrade gracefully for recipients not using IM clients capable of tagging conversations. On updating the tags in a
conversation, a client would send the recipient a message, like any other in the conversation, containing the new
tags; however, this message would be prefixed with tagthis:. Compatible clients would detect this as a set of tags
and update the tag box accordingly. If the client was not compatible, for instance on a mobile device, it would
simply show the set of tags prefixed with tagthis: in the conversation window, alerting the recipient that the
conversation had been tagged and ensuring openness across the system. Conversely, a user with an incompatible
client may update the tags by prefixing a tag set with tagthis: and sending it like any other message, thus ensuring
backwards compatibility for other clients while remaining an invisible process on clients developed for tagging.

Figure 4: Early mockup of chat client
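As a rough illustration of how a tagging-aware client might tell these three kinds of message apart, consider the Python sketch below. The project's client was written in Java, so this is not its code; the function name is hypothetical, and the comma separator and lower-casing of tags are assumptions, since the report does not state how individual tags are delimited within a tag set.

```python
TAG_PREFIX = "tagthis:"       # prefix named in the report
NEW_SESSION = "{newtagset}"   # session marker named in the report


def classify(body):
    """Classify a message body as a session marker, a tag set, or ordinary chat.

    Returns ("session", None), ("tags", [tag, ...]) or ("chat", body).
    Assumption: tags are comma separated and compared case-insensitively.
    """
    text = body.strip()
    if text == NEW_SESSION:
        return ("session", None)
    if text.lower().startswith(TAG_PREFIX):
        raw = text[len(TAG_PREFIX):]
        tags = [t.strip().lower() for t in raw.split(",") if t.strip()]
        return ("tags", tags)
    return ("chat", body)
```

A compatible client would route a "tags" result to the shared tag box and print a short note in the conversation window, while an incompatible client simply displays the raw body, which is what keeps the scheme backwards compatible.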
Functional Requirements
This client must allow a user to...
connect and sign in to a chat network,
have and manage a list of contacts, stored on the server, with whom they can chat,
change their presence (i.e. Available, Away, On the Phone),
take part in a conversation with a user,
agree upon tags for that conversation with a user,
log out
XMPP SERVER
The server for this project had to be a fully capable XMPP server, able to handle all common chat
server functionality while, distinctively in this case, processing tags across the conversations it handles. In
essence all clients connect through this server, and this server also connects to an external database
system, as seen in figure 3.
Concept
The server would need to be a standalone solution or an adaptation of an existing server, to ensure the correct
processing of messages and tags. The server's main users would be system administrators only, so limited
work would be needed in producing a graphical interface; however, the server indirectly supports all the IM
clients and therefore all users of the entire system, so it must be a scalable solution that provides all the
functionality required.
The server must be able to instantly identify tagged conversations and update the database as appropriate,
ensuring all information relevant to the network's knowledge systems is dynamic and up to date.
Tagging from the Server Perspective
The server handles all packets sent over the IM network, and directs them to their correct destination.
The server will primarily have to identify any new conversation sessions started between two users. This
is achieved by a compatible client sending the {newtagset} message, which the server will
intercept (see figure 5). The server then logs a record of this conversation in the database, storing the
participating users and the time at which the conversation began, and generating a unique id number for that
conversation, cid.

Figure 5: XMPP Server intercepting messages
The server also listens (see figure 5) for message packets prefixed with "tagthis:". Upon receipt of such a
packet it will forward it on to the recipient as normal; however, it will also process the tag set and add it to the
database. To do this the server takes the packet and identifies the sender and recipient. It uses these details to
find the most recent conversation session involving these two participants and gets the relevant conversation
id number. It will then check the database for any tags already stored in relation to this conversation,
removing them, as the new set always replaces the previous one. Finally the new tag set is broken into
individual tags and stored in the database in relation to that conversation.
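The server-side steps just described (find the latest session for the pair, discard its old tags, store the new set) can be sketched as follows. This is a minimal in-memory Python illustration under stated assumptions, not the Openfire plugin itself: the class and method names are hypothetical, a dictionary stands in for the database, and tags are assumed to be comma separated.

```python
import itertools
import time


class ConversationStore:
    """In-memory stand-in for the conversations and tags tables (hypothetical)."""

    def __init__(self):
        self._next_cid = itertools.count(1)
        self.conversations = []   # dicts with cid, from, to, started
        self.tags = {}            # cid -> list of tags

    def open_conversation(self, sender, recipient, started=None):
        """Called when a {newtagset} message is intercepted."""
        cid = next(self._next_cid)
        self.conversations.append({
            "cid": cid,
            "from": sender,
            "to": recipient,
            "started": time.time() if started is None else started,
        })
        return cid

    def latest_cid(self, user_a, user_b):
        """Most recent session between the two users, in either direction."""
        pair = {user_a, user_b}
        matches = [c for c in self.conversations if {c["from"], c["to"]} == pair]
        if not matches:
            return None
        return max(matches, key=lambda c: (c["started"], c["cid"]))["cid"]

    def replace_tags(self, sender, recipient, tag_set):
        """Called for a tagthis: packet; the new set replaces any stored tags."""
        cid = self.latest_cid(sender, recipient)
        if cid is None:
            return None  # no tagged session was ever opened for this pair
        self.tags[cid] = [t.strip() for t in tag_set.split(",") if t.strip()]
        return cid
```

Replacing the whole set on every update, rather than diffing it, mirrors the report's description and keeps the stored tags identical to what both users currently see in the shared text box.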
Functional Requirements
This server must...
handle basic XMPP functionality (chat / presence / contact management),
listen for packets containing certain key terms,
connect to a database, appropriately storing conversation and tag information

DATABASE
The database for this system would need to be a relational database allowing for the successful storing,
partitioning and expansion of the entire system's data needs. The visualisation essentially works from the
filtering of data within this database, so it is essential that information is stored in a way conducive to this. By
ensuring sufficient live tagging information was being stored correctly and efficiently from the XMPP
network, the database could then be easily expanded with further information from more stable state data
stores such as HR databases, without affecting the rest of the system.
For the auspices of this project a minimal database was required to store the required information from the
XMPP network, which is the only information needed for the visualisation of the network Knowledge Systems,
i.e. users and their tag groups.
Structure
This database would consist of three tables based on the information available from the XMPP network, those
being users, conversations and tags (see figure 6).
The Users table is simply a read only table containing basic information about users needed for operation of
this system; this includes the user's actual name for visualisation, their Instant Messaging handle (address), for
identification by the IM server, and an ID number. As shown in figure 6 I have added a few extra fields to show
how the system can be expanded.
The Conversations table contains information on all conversations registered by the server, with tuples
containing a unique id number for that conversation, the id numbers of the sending and receiving user and a
timestamp. The user id numbers in the 'to' and 'from' columns of the table relate directly to the id field in the
users table.
The Tags table holds a record of all tags used across all conversations, recording each tag and the related
conversation id.

Figure 6: Database Design.
N.B. The table users_1 is merely a mirror of the users table, simply to identify the two separate one to many links.

The users table uses the field 'uid' to hold an automatically generated integer, and uses that as the primary key
for the user. The user's handle could also have been used as a primary key but, as the system needed to scale,
storing string handles in the to and from fields of the conversations table would be less efficient than storing
integers, given the size to which the conversations table could grow.
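To make the three-table structure concrete, here is a small sketch using Python's built-in sqlite3 module rather than the project's MySQL server. The names `uid` and `cid` come from the report; everything else is an assumption made for illustration, including the column names `from_uid` and `to_uid` (chosen because `from` and `to` are SQL keywords), the example handles and the helper function names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    uid    INTEGER PRIMARY KEY AUTOINCREMENT,  -- integer key, as in the report
    name   TEXT NOT NULL,                      -- actual name, for visualisation
    handle TEXT NOT NULL UNIQUE                -- IM address
);
CREATE TABLE conversations (
    cid      INTEGER PRIMARY KEY AUTOINCREMENT,
    from_uid INTEGER NOT NULL REFERENCES users(uid),
    to_uid   INTEGER NOT NULL REFERENCES users(uid),
    started  TEXT NOT NULL                     -- timestamp
);
CREATE TABLE tags (
    cid INTEGER NOT NULL REFERENCES conversations(cid),
    tag TEXT NOT NULL
);
"""


def build_demo():
    """Create an in-memory database holding one tagged conversation."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany(
        "INSERT INTO users (name, handle) VALUES (?, ?)",
        [("Bob", "bob@example.org"), ("Sam", "sam@example.org")],
    )
    db.execute(
        "INSERT INTO conversations (from_uid, to_uid, started) "
        "VALUES (1, 2, '2008-03-01 10:00:00')"
    )
    db.executemany("INSERT INTO tags (cid, tag) VALUES (1, ?)", [("java",), ("xml",)])
    return db


def tags_between(db, handle_a, handle_b):
    """Tags on any conversation between two handles, in either direction."""
    rows = db.execute(
        """
        SELECT t.tag
        FROM tags t
        JOIN conversations c ON c.cid = t.cid
        JOIN users f ON f.uid = c.from_uid   -- first link to users
        JOIN users r ON r.uid = c.to_uid     -- second link (users_1 in figure 6)
        WHERE (f.handle = ? AND r.handle = ?) OR (f.handle = ? AND r.handle = ?)
        ORDER BY t.tag
        """,
        (handle_a, handle_b, handle_b, handle_a),
    )
    return [row[0] for row in rows]
```

The two joins against `users` correspond to the two separate one to many links that figure 6 draws with the mirrored users_1 table.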
Expansion
As mentioned earlier, this database would need to be flexible, allowing for further visualisation development
based on usage data from the XMPP network, such as visualising connections that exist between regional
offices. By keeping the users table separate from the rest of the system, in that it is not writable by the
XMPP system, we can further expand it to include any extra data needed for visualisations. That extra data can
of course be imported from other databases or new tables.
This database design can also store information on group conversations without change if needed. This would
be a primitive implementation of storing group tagging information: a conversations record
would be made between each member of a group conversation and each of the other participants, and
when tags were applied to the group conversation they would be applied to all of the individual conversations. This
in essence would be the same as n(n - 1)/2 pairwise conversations taking place, n being the number of participants
in the group. More effective means for storing such information exist, but this project's scope lies between two users
for the time being.
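A short sketch of this primitive group scheme is given below, with hypothetical function names. Pairing every participant with every other yields n(n - 1)/2 distinct pairs, so a group of four would be stored as six two-person conversation records, each carrying the same tags.

```python
from itertools import combinations


def pairwise_records(participants):
    """Every unordered pair of distinct participants in a group conversation."""
    return list(combinations(sorted(set(participants)), 2))


def tag_group(participants, tags):
    """Apply the group's tags to each pairwise conversation record."""
    return {pair: list(tags) for pair in pairwise_records(participants)}
```

The quadratic growth in records is one reason the report describes this as primitive; a dedicated group table would store the tags once per conversation instead.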

VISUALISATION SYSTEM
All methods of visualisation consist of representing information, as a whole or as a subset, by graphical means. This
project's aim was not only to represent data from the system but to allow dynamic filtering of that data
through the visualisation itself. Only two visualisation concepts were defined in the Project Definition
Document, with an aim to produce further visualisations dependent on the availability of time.
A requirement of all these visualisations was that they be both intuitive and dynamic. All the visualisations
needed to be designed in such a way that they were clear to use and digest by any level of user, be it a top
level CEO or a new hire; at the same time, all visualisations needed to be produced from the latest available
information in the database.
Adding to these requirements was the need to make the visualisations available across different platforms, in
an easy and accessible fashion. Naturally a web based framework was chosen, allowing access to the system
via a user's web browser, and also having the benefit of allowing easy dissemination of the latest information into
these visualisations.
THE VISUALISATIONS
Tag Cloud
A tag cloud is a method of displaying a set of words in a collection, changing the size of each of those words
to represent the frequency of that word (or its weighting, discussed below) within that set. The tag cloud
concept is popular across the blogosphere, by far the biggest users of tagging. The information available from
the database lends itself well to such a method of representation, and can be quickly and easily
rendered in all web browsers. See Appendix 1 for a tag cloud representation of the words in this
project's Project Definition Document.
As a tool, the tag cloud would provide a system wide analysis of all terms or users, allowing a simple
one-dimensional exploration of all users or all tags simply by clicking a term to filter the data collection
accordingly.
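As an illustration of how a weight might drive the size of a word in a tag cloud, the sketch below maps a weight in the range 0 to 1 onto a font size. The pixel bounds, the linear mapping and the function names are assumptions for this example only; the project's PHP code may have scaled sizes differently.

```python
def font_size(weight, min_px=10, max_px=36):
    """Linearly map a weight in [0, 1] to a font size in pixels (clamped)."""
    w = min(max(weight, 0.0), 1.0)
    return round(min_px + w * (max_px - min_px))


def tag_cloud(weights, min_px=10, max_px=36):
    """Return (term, font size) pairs in alphabetical order, as tag clouds usually are."""
    return [(term, font_size(w, min_px, max_px)) for term, w in sorted(weights.items())]
```

Clamping out-of-range weights keeps a single bad value from producing unreadably small or oversized text.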
Animated Filtering Visualisation
A style of visualisation based on the graphical Trampoline system used to explore emails from the Enron
collapse in the United States (Trampoline Systems, 2005) was required for this project. From researching
previous methods of data visualisation and concepts of representing relationships spatially and semantically
(Chen, 1999), it was clear a slightly different solution would be required, as discussed in chapter 3.
The concept was to allow a user to choose a start point for the visualisation, be it a person or a tag. Upon
clicking this starting node, many other nodes would emerge from it. If the start point was a person, the tags that
person uses would emerge as new nodes around that person; if it was a tag, the people using that tag would
emerge as new nodes around it. These new nodes would be sized according to their weight, with the largest
node being the most relevant. This demonstrates one level of filtering the information available; being a
non-dyadic system, it must go beyond this and allow for a further level of filtering when dealing with
people. On clicking one of the child nodes, the current parent node would move aside, the clicked node
would become the active node, and from it would emerge the relevant children.
For example, start with Bob. Clicking on Bob reveals the tags Bob likes to talk about. Clicking on the tag Java
moves Bob to the side, as he is still a part of this filter, and moves Java to the centre of the visualisation. From
Java emerge all the people with whom Bob talks about Java. This process is outlined in table 1 and
graphically in figure 7.
Parent    Active    Children
-         Bob       Java, Html, XML, Web Design
          (User clicks Java)
Bob       Java      Sam, Mark, James
          (User clicks Sam)
-         Sam       Java, Project, XML
Table 1: Demonstration of Visualisation Process

Figure 7: Graphic Demonstration of Visualisation
THE HOST
The web host for the system consists of two main subsystems, a presentation layer and a business layer. The
presentation layer would present a website displaying the visualisations required by the user, allowing them to
choose the starting points before loading visualisations, and providing the data needed by the visualisations,
such as terms and weights.
Weighting
Weighting of the information within the system would prove to be one of its more complex aspects, requiring research
and fine tuning to identify an appropriate method for weighting terms.
The success of each visualisation mentioned above relies on weighting. Without giving the individual nodes
(words) within the visualisation a weight they would all remain the same size, delivering no value to the end
users of the system.
To calculate the weight used for scaling the nodes, a number of calculations must be performed on the group
of terms to be displayed.
Each term is given an initial weight based on its frequency over the past 12 months; scaling factors are used,
giving the most recent months prevalence. Deriving the value from the whole of the past year ensures that
a subject that was once major to a user, but has not been tagged in conversation by that user for a
period of time, is not disregarded by the system.
The formula is as follows...
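\[
W_t = \left[ \sum_{m=9}^{12} \mathrm{TF}(t, m) \cdot S_m \right]
    + \left[ \frac{\sum_{m=1}^{8} \mathrm{TF}(t, m)}{8} \cdot S_9 \right]
\]
\[
S_{12} = 1; \quad S_{11} = 0.2; \quad S_{10} = 0.05; \quad S_9 = 0.01
\]
Where W_t is the weight of term t; TF is a function returning the frequency of term t in month m; with S being a
predefined scaling factor.
This formula returns a scaled frequency for a term based on its occurrences over the past year.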
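The scaled frequency can be sketched in Python as below. The scaling factors are those given above; the function name, the dictionary input format and the month numbering (1 for the oldest month, 12 for the most recent) are assumptions made for this illustration.

```python
# Scaling factors for the four most recent months, as stated in the report.
SCALE = {12: 1.0, 11: 0.2, 10: 0.05, 9: 0.01}


def adjusted_frequency(monthly):
    """Scaled yearly frequency of one term.

    `monthly` maps a month index (assumed: 1 = oldest, 12 = most recent)
    to the number of times the term was tagged in that month.
    """
    # Months 9 to 12 each contribute their count times their own factor.
    recent = sum(monthly.get(m, 0) * s for m, s in SCALE.items())
    # Months 1 to 8 are averaged, then scaled by the month 9 factor.
    older_mean = sum(monthly.get(m, 0) for m in range(1, 9)) / 8
    return recent + older_mean * SCALE[9]
```

For instance, one use in the latest month, three in month 11, one in month 10 and a single older use gives 1 + 0.6 + 0.05 + 0.00125 = 1.65125, so the old use still registers but barely moves the result.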
The data must then be processed through a number of steps to produce a weighting value appropriate for use
in visualisations, ideally a value between 0 and 1, which allows a position to be identified between the
minimum and maximum scale being used in that visualisation.
Once we have a set of terms and their adjusted frequencies from the past year we can process the terms.
The following is based on a set of terms, each with an initial frequency fq. Figure 8 shows a sample set of terms
before being processed.

Figure 8: Frequencies before Processing
Step 1: Harsh Normalisation
Initially we normalise the array of frequencies, reducing spikes and bringing all values to a range < 1. This
method was adapted from the work of Teknomo (2007) in visualisation. It uses a scaling factor (SF) to vary the
amount of change.
\[
SF = \max(\max(fq), 100) \cdot 10
\]
\[
NW_i = \frac{fq_i}{\sqrt{(fq_i)^2 + SF}}
\]
Where NW_i is the normalised frequency of term i. The results of this step as applied to the example set shown
in Figure 8 are displayed in Figure 9.
Data plotted in Figure 8 (x-axis: Term; y-axis: Frequency):
Term                 1         2         3      4         5     6     7        8
Adjusted Frequency   1.65125   1.6525    100    1.65375   30    10    2.2075   0.6025

Figure 9: Frequencies after normalisation
Step 2: Data Smoothing
This process smooths the spread of the data, removing large jumps and scales and bringing a more even balance
to the set. This smoothing implementation is modelled on that of Brignell (2003).
\[
SW_i = 1 - (0.5 \cdot (1 - NW_i))
\]
Where SW_i is the smoothed normalised frequency of term i. The results of this step as applied to the example
set are shown in Figure 10.

Figure 10: Frequencies after smoothing
Step 3: Transform to a range between 0 and 1
This final step involves transforming the range, allowing for easier manipulation of the data. To perform this
step we need the maximum and minimum values of the set of terms' smoothed normalised frequencies. To
ensure the system does not encounter a divide by zero error when all the numbers in a set are equal,
we multiply the minimum value by 0.9999.
\[
W_i = \frac{SW_i - (\min(SW) \cdot 0.9999)}{\max(SW) - (\min(SW) \cdot 0.9999)}
\]
Where W_i is the processed weight of term i, lying between 0 and 1. The results of this step as applied to the
example set are shown in Figure 11.
Data plotted in Figure 9 (x-axis: Term; y-axis: Frequency):
Term                  1             2             3             4             5             6             7             8
Harsh Normalisation   0.052146067   0.052185434   0.953462589   0.052224801   0.688247202   0.301511345   0.069637811   0.019049266
Data plotted in Figure 10 (x-axis: Term; y-axis: Frequency):
Term                  1             2             3             4             5             6             7             8
Data Smoothing        0.526073033   0.526092717   0.976731295   0.526112401   0.844123601   0.650755672   0.534818906   0.509524633

Figure 11: Final weights as derived from transformation
The derived weighting is then used by the individual visualisations to scale their respective elements.

Data plotted in Figure 11 (x-axis: Term; y-axis: Frequency):
Term                       1             2             3   4             5             6             7             8
Transformed to range 0-1   0.035525056   0.035567182   1   0.035609308   0.716200037   0.302364237   0.054242509   0.000109046
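The three processing steps can be chained together as in the following Python sketch. The arithmetic follows the formulas in steps 1 to 3 and the sample input is the adjusted frequency row plotted in Figure 8; the function names are hypothetical, and the project itself performed this calculation in PHP.

```python
import math


def harsh_normalise(freqs):
    """Step 1: damp spikes; SF = max(max(fq), 100) * 10."""
    sf = max(max(freqs), 100) * 10
    return [f / math.sqrt(f * f + sf) for f in freqs]


def smooth(values):
    """Step 2: pull every value halfway towards 1."""
    return [1 - 0.5 * (1 - v) for v in values]


def to_unit_range(values):
    """Step 3: rescale so the largest value becomes 1.

    The minimum is multiplied by 0.9999 so that a set of identical values
    does not cause a division by zero.
    """
    low = min(values) * 0.9999
    high = max(values)
    return [(v - low) / (high - low) for v in values]


def weights(freqs):
    """Run all three steps over a list of adjusted frequencies."""
    return to_unit_range(smooth(harsh_normalise(freqs)))


# Adjusted frequencies of the eight sample terms plotted in Figure 8.
FIGURE_8 = [1.65125, 1.6525, 100, 1.65375, 30, 10, 2.2075, 0.6025]
```

Calling `weights(FIGURE_8)` gives values that agree with the Figure 11 data to the precision printed there, which suggests the reconstructed formulas match the ones used to draw the charts.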
5. TECHNOLOGIES
This chapter outlines the technologies that were used in the development of this project. It is interesting to
note that all the technologies used in this project, bar Flash, are in fact open source.
XMPP
The Extensible Messaging and Presence Protocol (XMPP) is the IETF's
formalization of the base XML streaming protocols for instant messaging
and presence developed within the Jabber community starting in 1999
(XMPP Standards Foundation, 2007). XMPP provides a platform that is
both open and flexible, allowing development of custom features on top of
its robust layer of instant messaging features. Being an open protocol, an
evolution of the Jabber protocols, XMPP enjoys a sizeable market share
when it comes to IM providing technologies. In 2005 Google launched its
own IM service (Google Talk), based upon the XMPP architecture.
As a means of communicating data, the protocol defines the method by
which XML is streamed across a network: defined XML stanzas, containing
information on presence, messages and custom features, are enveloped
within a <stream/> node (see figure 12).
At a high level, servers and clients operating on the XMPP protocol simply have to be able to construct
and understand these XML envelopes to be operational.
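To show what constructing and understanding one of these stanzas involves, the sketch below builds and reads a single message stanza with Python's standard xml.etree.ElementTree module. It covers only the stanza, not the enclosing stream, authentication or delivery, all of which a library such as Smack handles in the real client; the function names and the example address are hypothetical.

```python
import xml.etree.ElementTree as ET


def message_stanza(to, body):
    """Serialise a minimal <message/> stanza carrying a text body."""
    msg = ET.Element("message", {"to": to, "type": "chat"})
    ET.SubElement(msg, "body").text = body
    return ET.tostring(msg, encoding="unicode")


def read_stanza(xml_text):
    """Return the (recipient, body) pair of a serialised message stanza."""
    msg = ET.fromstring(xml_text)
    return msg.get("to"), msg.findtext("body")
```

Because the tag set travels in an ordinary body element, a tagthis: message needs no protocol extension, which is why clients unaware of tagging can still display it.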
JAVA
The XMPP client will be developed in the Java language, allowing for portability across different systems and
the ability to take advantage of stable GUI libraries.
SMACK: AN API FOR DEVELOPING JAVA XMPP CLIENTS
Smack (Jive Software, 2006) is an open Java XMPP client API released by Jive Software under their open source
arm, Ignite Realtime. As described on the project page, Smack is "A pure Java library, it can be embedded into
your applications to create anything from a full XMPP client to simple XMPP integrations such as sending
notification messages and presence enabling devices" (Jive Software, 2006).
This API deals with the basic overhead functions of an XMPP client, such as creating and managing connections
to a server, much like a socket manager. It also assists in the creation of the XML envelopes discussed earlier,
so that less time is spent developing ways to communicate with the server and more is spent taking advantage
of the integrated connection with the server.
OPENFIRE XMPP CHAT SERVER
The OpenFire Server (Jive Software, 2007) is also developed under Ignite Realtime and provides a fully
functional open Java based XMPP server. To complement this stable platform's open architecture, it allows for
the development and use of plugins. By using an IDE such as Eclipse it is possible to create a development
environment for the server and, in turn, its plugins. A developer's guide is available from Jive (Openfire: Plugin
Developer Guide, 2007).
|--------------------|
| <stream> |
|--------------------|
| <presence> |
| <show/> |
| </presence> |
|--------------------|
| <message to='foo'> |
| <body/> |
| </message> |
|--------------------|
| ... |
|--------------------|
| </stream> |
|--------------------|

Figure 12: XML Envelope
As seen in the XMPP RFC (Jabber Software Foundation, 2004)
Developing a plugin for this server would allow for the seamless integration of the server listener needed for
handling tags.
MYSQL DATABASE
MySQL is the industry standard open source database. It is a standard SQL database server providing immense
data flexibility and stability, while remaining free. On top of being powerful, MySQL is a relational database,
allowing queries to span multiple tables. Where this comes strongly into play is in how quickly it handles
multiple nested relational queries when collecting frequencies for a term over the past year.
The advantage of using any Structured Query Language database is that data becomes portable and is not tied
to one platform, with SQL queries operating similarly on all SQL offerings. This means that if the system were
to be adapted to sit on top of a different database, only the database connectors would need to be changed, leaving
the rest of the code unaffected.
APACHE WEB SERVER
Apache is by far the most popular web server available to date, and has a reputation for robustness. It is an
open source web server that runs on both Linux and Windows platforms, allowing it to be deployed in any
number of configurations.
Another advantage of the Apache web server is its almost seamless integration with PHP.
PHP
PHP is the Hypertext Preprocessor, a server side scripting language that allows for quick and powerful
programming within HTML files before they leave the server. PHP was chosen as the server side language due
to my own past experience with it and its reliable integration with Apache and MySQL.
PHP has the advantage of being custom developed to link in with the Apache web server, among others, while
also having different versions for the various operating systems upon which it runs, giving it the
added benefit of being extremely fast.
PHP will be used primarily for collecting data from the database, calculating the weighting of sets (as described
above), and then producing either tag clouds or custom XML files. PHP will also be used to create a script to
flood the database with automatically generated random data, to create a test bed for the system.
FLASH
Adobe's Flash is the leading development tool that closely combines the needs of a developer with
those of an animator. Providing a strong and mature set of tools for both the designer and
programmer paradigms, atop a new Object Orientated version of the Flash ActionScript language, Flash was an
immediate choice for this project and the development of its click through visualisations. Not only does Flash
have the backing of nearly a decade of web applications under its belt, but it has one of the widest distribution
bases, with a reported 99% Flash installation base (Adobe, 2007).
Due to the credence and longevity of Flash development there exist many tutorials and resources online to
help with mastering the language and design techniques. To add to these benefits, the latest version of Flash's
ActionScript language, AS3, has been tuned to work seamlessly with XML data, allowing easy input and
processing of vast amounts of data, such as that from the visualisation system.
When researching this project another new technology was beginning to make inroads into the web based
visual applications market, specifically Microsoft's Silverlight (Microsoft, 2007). The technology did show
promising signs; it was, however, still relatively new, sparsely documented, and locked users into Windows platforms.
XML
"Having a minimal number of features in XML facilitates writing processors that can handle virtually any XML
document, making XML documents universally interchangeable" (Young, 2002). XML, the next functional
progression of SGML after HTML, is a markup language dedicated to adaptability, allowing complete control
over the content and structure of a document through doctypes.
XML would not only be used by the XMPP system in its transfer of data, as seen in figure 12, but would also
be generated by the PHP weighting system, allowing for the semantic display of the system's resulting data
and consequently for data input into the Flash visualisations.


6. IMPLEMENTATION
This chapter delves into the implementation of the system, examined under 5 main headings: Client, Database,
XMPP Server, Web Server and Flash Development.
CLIENT DEVELOPMENT
Client development was split into two main sections, the business logic and the GUI. Consequently this was
the structure of the application's packages.
For development I used the Eclipse IDE for all coding, with the assistance of the NetBeans IDE in developing
the GUI. To begin, I needed to download and register the Smack API libraries in the application's Java Build
Path.
The business logic consisted of two classes, the conductor.class and the user.class. The purpose of the
conductor class was to conduct the operations of the application, primarily by using the tools provided by
the Smack API to create a connection with the XMPP server, handle logging in and out of the system, set up
listeners for new conversations and status updates received from the server, and take the appropriate action
for these events. The actions of these listeners lead to the conductor class's secondary function, to manage
the GUI windows, such as showing and updating contact lists and creating new conversation windows.
The User class stores information on the current user logged into the client after a successful connection
has been made with the system, also holding a copy of that user's contacts list and their respective
statuses.

Figure 13: Login Screen

Figure 14: Contacts Screen
Developing the GUI components involved linking with both the conductor class and again the Smack API, for
instance in a chat window taking a user's input and sending it to the recipient, and in return the conductor
class receiving new messages and updating the appropriate conversation's chat window with that message.
The client's GUI would consist of 3 main screens, loosely based on other popular IM clients such as Windows
Live Messenger and Google Talk. These screens are the Login Screen (figure 13), the Contacts Screen (figure
14) and the Conversation Window (figure 15). Of particular note is the Tag Box in the Conversation Window,
sitting just underneath the main conversation input box, where a user can enter their preferred tags for the
conversation and hit return, at which point the prefix "tagthis:" is affixed to the contents of the box and
the result is sent as a message packet.

Figure 15: Conversation Window
To test the client through development I installed the Openfire server and ran it in its basic operational
mode, preconfiguring a number of user accounts on the server to allow testing of contacts and presence
features.
DATABASE DEVELOPMENT
Development of the database consisted of taking the designed schema (figure 6), identifying the required
variable types and creating the relevant SQL statements to build the database. These were as follows...
CREATE TABLE `conversations` (
  `cid` int(11) NOT NULL auto_increment,
  `from` int(11) NOT NULL,
  `to` int(11) NOT NULL,
  `time` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY (`cid`)
);

CREATE TABLE `tags` (
  `cid` int(11) NOT NULL,
  `tag` varchar(255) NOT NULL,
  PRIMARY KEY (`cid`, `tag`)
);

CREATE TABLE `users` (
  `uid` int(11) NOT NULL auto_increment,
  `handle` varchar(255) NOT NULL,
  `name` varchar(255) NOT NULL,
  `location` varchar(255) NOT NULL,
  `dept` varchar(255) NOT NULL,
  PRIMARY KEY (`uid`)
);

Following creation of the database I loaded the users table with the details of users I had created on the
XMPP server, thus preparing the database to receive data from the IM network.
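The schema above can be tried out without a MySQL installation. The following is a minimal sketch only, assuming SQLite (via Python's standard sqlite3 module) as a stand-in for MySQL, so the MySQL-specific column types and auto_increment are swapped for their SQLite equivalents. The helper names create_database and add_user are illustrative and not part of the project code.

```python
import sqlite3

# Assumption: SQLite stands in for the MySQL database described in the report.
# "from" and "to" are quoted because they are reserved words in SQL.
SCHEMA = """
CREATE TABLE conversations (
    cid    INTEGER PRIMARY KEY AUTOINCREMENT,
    "from" INTEGER NOT NULL,
    "to"   INTEGER NOT NULL,
    time   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE tags (
    cid INTEGER NOT NULL,
    tag VARCHAR(255) NOT NULL,
    PRIMARY KEY (cid, tag)
);
CREATE TABLE users (
    uid      INTEGER PRIMARY KEY AUTOINCREMENT,
    handle   VARCHAR(255) NOT NULL,
    name     VARCHAR(255) NOT NULL,
    location VARCHAR(255) NOT NULL,
    dept     VARCHAR(255) NOT NULL
);
"""


def create_database(path=":memory:"):
    """Create the three tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_user(conn, handle, name, location, dept):
    """Insert one user row and return its automatically generated uid."""
    cur = conn.execute(
        "INSERT INTO users (handle, name, location, dept) VALUES (?, ?, ?, ?)",
        (handle, name, location, dept),
    )
    conn.commit()
    return cur.lastrowid
```

The composite primary key on tags means the same tag cannot be stored twice for one conversation, which is what allows a tag set to be treated as a set.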
XMPP SERVER PLUGIN DEVELOPMENT
Using the Plugin Developer Guide (Jive Software, 2007), I was able to create a very basic plugin that would
listen to and act upon all packets being received by the server. Building on this I would filter incoming
packets, listening only for those indicating new conversations or a new set of tags for a conversation. Each
of these two events would trigger its own action.
If the packet indicated a new conversation between two users (the packet contained "{newtagset}"), the
sender and recipient would be extracted from the packet and checked against the SQL database to get their
user id numbers. A new record would then be created in the conversations table of the database, consisting
of the two users' ids, a timestamp, and an automatically generated id for that conversation.
If the packet described a new tag set for a conversation (the packet started with "tagthis:"), the sender and
recipient would be extracted from the packet and checked against the SQL database to get their user id
numbers. These would then be used to identify the most recent conversation record involving these two users
in the database, returning that conversation's id. Any previous tags recorded for that conversation in the
database are deleted, and the newly received tag set is separated into individual tags and added to the
database.
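The two packet-handling branches described above can be sketched as follows. This is an illustrative Python version rather than the Java plugin itself. It assumes a database with the three tables from the schema, that tags inside a "tagthis:" message are separated by whitespace or commas, and that a conversation "involving" two users matches in either direction. None of these details are stated in the text, and the function names are hypothetical.

```python
import sqlite3

NEW_CONVERSATION_MARKER = "{newtagset}"  # marker quoted in the text
TAG_PREFIX = "tagthis:"                  # prefix added by the client's Tag Box


def user_id(conn, handle):
    """Look up a user's id by handle; None if the user is unknown."""
    row = conn.execute("SELECT uid FROM users WHERE handle = ?", (handle,)).fetchone()
    return row[0] if row else None


def handle_packet(conn, sender, recipient, body):
    """Act on one message; return 'conversation', 'tags' or None if ignored."""
    from_id, to_id = user_id(conn, sender), user_id(conn, recipient)
    if from_id is None or to_id is None:
        return None
    if NEW_CONVERSATION_MARKER in body:
        # New conversation: cid and timestamp are filled in by the database.
        conn.execute(
            'INSERT INTO conversations ("from", "to") VALUES (?, ?)', (from_id, to_id)
        )
        return "conversation"
    if body.startswith(TAG_PREFIX):
        # Assumption: the most recent conversation may run in either direction.
        row = conn.execute(
            'SELECT cid FROM conversations '
            'WHERE ("from" = ? AND "to" = ?) OR ("from" = ? AND "to" = ?) '
            'ORDER BY time DESC, cid DESC LIMIT 1',
            (from_id, to_id, to_id, from_id),
        ).fetchone()
        if row is None:
            return None
        cid = row[0]
        # Assumption: tags are separated by whitespace and/or commas.
        tags = sorted(set(body[len(TAG_PREFIX):].replace(",", " ").split()))
        conn.execute("DELETE FROM tags WHERE cid = ?", (cid,))
        conn.executemany(
            "INSERT INTO tags (cid, tag) VALUES (?, ?)", [(cid, t) for t in tags]
        )
        return "tags"
    return None
```

Because the old tag set is deleted before the new one is written, re-tagging a conversation replaces its tags rather than accumulating them, matching the behaviour described above.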
Once compiled, the plugin is simply loaded onto the server via the server control panel.
Because this plugin is stateless and simply takes action on packets as they are received, it did not have
any effect on the speed of operation or functions of the XMPP server.
WEB SERVER DEVELOPMENT
The web server used to host the visualisations would be an Apache server running PHP. The main
requirements of this system were not only to host the visualisations, but to process the raw information from
the database using the weighting schemes outlined earlier. With the results of this process the system would
either create visualisations in the form of tag clouds on an HTML document, or provide an XML document
readable by the Flash system.
At the core of this system were the PHP functions to collect information from the database and process it
accordingly, based on a set of filters. These filters would be a word, being a user's name or a tag, and what
kind of term it is, i.e. a tag or a user. In certain cases the filter may contain both a tag and a user, when
determining whom a user talks to about a certain topic.
SQL Collection Queries
In total, for each new visual representation to be made by the system, 5 SQL queries would be undertaken.
Each of the 5 queries would return information from a specific time period in the previous year. For example,
the query below returns the frequency of all tags from two months previous.
SELECT tag, COUNT(tag) AS quantity
FROM tags, conversations WHERE
tags.cid = conversations.cid AND
conversations.time < DATE_SUB(CURDATE(),INTERVAL 1 MONTH) AND
conversations.time >= DATE_SUB(CURDATE(),INTERVAL 2 MONTH)
GROUP BY tag ORDER BY tag ASC
By applying these 5 queries, the system collects all the raw data needed by the weighting schemes in
Chapter 4.
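Since the five queries differ only in their time window, they can share one parameterised statement. The sketch below illustrates this in Python against SQLite (an assumption; the project used PHP and MySQL), passing the window boundaries in explicitly instead of using DATE_SUB. The actual five period boundaries are defined in the design chapter and are not reproduced here, so the caller supplies them. The function names are hypothetical.

```python
import sqlite3

# Same join and grouping as the query above, with the window as parameters.
FREQUENCY_QUERY = """
SELECT tags.tag, COUNT(tags.tag) AS quantity
FROM tags JOIN conversations ON tags.cid = conversations.cid
WHERE conversations.time >= ? AND conversations.time < ?
GROUP BY tags.tag ORDER BY tags.tag ASC
"""


def tag_frequencies(conn, start, end):
    """Count each tag on conversations whose time lies in [start, end).

    start and end are 'YYYY-MM-DD HH:MM:SS' strings, which compare correctly
    as text in SQLite.
    """
    return conn.execute(FREQUENCY_QUERY, (start, end)).fetchall()


def collect_frequencies(conn, windows):
    """Run the frequency query once per (start, end) window, oldest-first or
    in whatever order the caller lists the windows."""
    return [tag_frequencies(conn, start, end) for start, end in windows]
```

Each result is a list of (tag, quantity) pairs, which is the raw input the weighting step expects for one time period.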
Weighting Feature
As outlined in the design chapter, the system requires 5 sets of information, ranging over the previous year,
to produce a definitive weight for each tag or user to be represented. After performing the SQL queries, the
results are multiplied by that time period's scaling factor, and passed as an array into the normalisation
function. This function applies the transformations outlined in chapter 4, and returns an array of the
tags/users along with their calculated weight. This information can then be used by any visualisation process
to display a defined collection appropriately.
Test Collection Creation
To efficiently test the abilities of the database, and the processes outlined above, I would require a vast
dataset on which I could apply these retrievals and transformations. Within the timescale of the project it
was not possible to retrieve information over a year-long time period from a test base of users. To overcome
this difficulty I produced a script to generate random conversation information and store it to the database,
as if it had actually occurred.
This sample dataset would consist of over 1000 conversations. This set was generated by a loop performing
SQL queries creating conversation records, each containing a randomly selected sender, a randomly selected
recipient, and a randomly generated timestamp falling within the previous year.
After the conversation records had been created, random sets of tags were applied to them. For the
purpose of this process I predefined a list of 30 tags that could be applied to conversations. The script then
ran through 2000 iterations, randomly selecting an existing conversation and a tag from the predefined list,
and storing these to the database.
As with any script based on random generation, the database was left with a fairly balanced spread of data,
which did not lend itself to visualisation. To counter this, I then went back through the list, removing any
conversations (and their tags) which had a sender with an even id number, thus creating a more fractured
collection.
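The generation steps above can be sketched as follows. This is a minimal in-memory Python version under stated assumptions, not the original PHP script: it builds lists rather than issuing SQL inserts, it assumes a sender never talks to themselves, and the default of exactly 1000 conversations is a placeholder for the "over 1000" mentioned above. The function name and parameters are hypothetical.

```python
import random
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 365 * 24 * 3600


def generate_test_data(user_ids, tag_list, n_conversations=1000,
                       n_tag_rounds=2000, seed=None, now=None):
    """Return (conversations, tags) mimicking the random test collection."""
    rng = random.Random(seed)
    now = now or datetime.now()

    # Step 1: random sender, recipient and timestamp within the previous year.
    conversations = []
    for cid in range(1, n_conversations + 1):
        sender, recipient = rng.sample(user_ids, 2)  # assumption: two distinct users
        when = now - timedelta(seconds=rng.randrange(SECONDS_PER_YEAR))
        conversations.append({"cid": cid, "from": sender, "to": recipient, "time": when})

    # Step 2: repeatedly attach a random tag to a random conversation.
    # A set mirrors the (cid, tag) primary key, so duplicates collapse.
    tags = set()
    for _ in range(n_tag_rounds):
        conv = rng.choice(conversations)
        tags.add((conv["cid"], rng.choice(tag_list)))

    # Step 3: skew the data by dropping conversations with an even sender id.
    kept = [c for c in conversations if c["from"] % 2 == 1]
    kept_ids = {c["cid"] for c in kept}
    tags = {(cid, tag) for cid, tag in tags if cid in kept_ids}
    return kept, tags
```

Passing a fixed seed makes the generated collection repeatable, which is convenient when comparing visualisations between runs.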
Tag Clouds
With an array containing tags or users, and their associated weights, it was now easy to apply visualisations
to this data. The first and easiest method by which the collection could be visualised could be performed
from within PHP, due to its ability to be embedded within HTML. The PHP script for this process would take
arguments passed into the URL and then display all the relevant tags or users on the browser in an
unstructured list of words; however, to give this representation some value, it would give each of these
items a different font size. A minimum and maximum font size was chosen, and the item in question would have
its weighting transformed into that size range, resulting in an appropriate font size being applied to it.
The font size of a term i was calculated by the following equation.
$$\mathit{size}_i = \big((\mathit{MaxFontSize} - \mathit{MinFontSize}) \times \mathit{weight}_i\big) + \mathit{MinFontSize}$$
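As a worked illustration of the font size equation, the sketch below maps a normalised weight between 0 and 1 onto a font size range and wraps each term in an HTML span. It is written in Python for illustration only (the project used PHP). The 10 and 40 pixel bounds and the function names are assumptions, as the report does not state the sizes chosen, and the hyperlinks described in the text are left out.

```python
from html import escape


def font_size(weight, min_font_size=10, max_font_size=40):
    """size = ((max - min) * weight) + min, for a weight in the range 0..1."""
    return (max_font_size - min_font_size) * weight + min_font_size


def tag_cloud_html(weights, min_font_size=10, max_font_size=40):
    """Render (term, weight) pairs as an unstructured run of sized spans."""
    spans = []
    for term, weight in weights:
        size = round(font_size(weight, min_font_size, max_font_size))
        spans.append('<span style="font-size: %dpx">%s</span>' % (size, escape(term)))
    return " ".join(spans)
```

With these bounds a weight of 0.5 gives (40 - 10) × 0.5 + 10 = 25, so a term of middling weight sits halfway between the smallest and largest text.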
For the purpose of this visualisation, I also applied hyperlinks to each of these terms; this allowed the
user to regenerate the tag cloud based on a simple filter such as all users talking about this tag, or all
tags talked about by this user.
XML Generation
To allow the valuable weighting information generated by this system to be further utilised for more
graphical interpretation, namely by Flash, there was a need to generate an XML document containing the
information required, such as terms and weights. This is the method by which Flash communicates with PHP most
effectively, and, as Flash has no native SQL support, the only method by which it can access information from
an SQL database. The XML file would be generated dynamically based on arguments passed into the URL,
allowing Flash to request filtered data.
The XML file took the following form, where each node would represent either a user or a tag, based on the
information requested in the URL.
<nodes>
<node>
<name>Sam</name>
<weighting>0.851880135449</weighting>
</node>
</nodes>
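A document of this form can be produced from an array of terms and weights in a few lines. The sketch below uses Python's standard xml.etree.ElementTree module purely for illustration; the project generated the file from PHP, and the function name nodes_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def nodes_xml(weights):
    """Build the <nodes> document from (name, weighting) pairs."""
    root = ET.Element("nodes")
    for name, weighting in weights:
        node = ET.SubElement(root, "node")
        ET.SubElement(node, "name").text = name
        ET.SubElement(node, "weighting").text = str(weighting)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

Building the document through an XML library rather than string concatenation means names containing characters such as & or < are escaped automatically.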

Figure 16: An overview of the web server environment
FLASH DEVELOPMENT
The visualisation to be developed within Flash would be developed as a display of circular nodes, each
containing a tag or user name. These would then be clickable to allow the display of child nodes showing
further related tags and/or users, as briefly described in figure 7.
Nodes
The concept of this visualisation was to start with a node in the centre, a single user or tag. By clicking
on this node, for example a user, a new set of nodes would be created around it, showing the tags which that
user talks about. These new nodes would be different sizes, representing the weight and relevance of the tag
in question. Like the tag cloud system, this scale would be determined based upon a minimum and maximum
size. By clicking on one of these child nodes, the screen would then clear, with the old central node moving
to the side and the selected child node now taking central focus.
Placement
To evenly display the child nodes around a central node, a number of mathematical operations first had to be
performed to identify their location on the screen, ensuring the visualisation was spaced both evenly and
amply. In essence the central node would exist with a number of child nodes around it, each of which is
spaced evenly and at a fixed distance from the central node. Knowing that a set number of children must be
displayed, a circle of 360° can be divided into i segments, each with an angle of z°. The central points of
these children can then be determined through trigonometric functions such as Sin and Cos. These will then
give X and Y coordinates in the range −1 to 1, as seen in figure 17.

Figure 17: Using Trigonometry to determine centre points

We must then transform these coordinates to a specific distance from the central node on the visualisation.
The minimum distance required can be represented as the radius of a larger circle upon the circumference of
which the centre points of the child nodes will lie. For i children we calculate their size from the term's
weight to determine their diameter and thus the larger circumference.
$$\mathit{circumference} = \sum_{n=0}^{i} \mathit{diameter}_n$$
By calculating the minimum required circumference for this circle, using π we can determine its radius, as
outlined below...
$$\mathit{radius} = \big((\mathit{circumference} \div \pi) \div 2\big)$$
This result may indeed create a radius that is too tight and therefore a cluttered diagram; to combat this
we add the maximum radius of a child node to this radius, allowing for further spacing. Nonetheless the
visualisation may still be very small in relation to the available space, so we take the radius to be the
maximum of the value calculated above and a value set in relation to the size of the visualisation's drawing
space. The coordinates of a point are then defined by the trigonometric functions Cos and Sin and are set in
relation to the central node's coordinates. For a central node C, the coordinates of child node E of Z total
children would be calculated as follows...
$$E_X = C_X - \left(\cos\left[\frac{360}{Z}\right] \times \mathit{radius}\right)$$
$$E_Y = C_Y - \left(\sin\left[\frac{360}{Z}\right] \times \mathit{radius}\right)$$
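The placement calculation can be sketched in a few lines. The version below is an illustrative Python rendering, not the ActionScript used in the project. It makes one assumption that is not spelled out above: the k-th child is placed at k times the segment angle 360/Z, so that the children spread around the circle instead of coinciding. The function names and the min_radius parameter (standing in for the value set in relation to the drawing space) are hypothetical.

```python
import math


def ring_radius(child_diameters, min_radius=0.0):
    """Radius of the circle on which the child centre points lie."""
    circumference = sum(child_diameters)      # children laid side by side
    radius = (circumference / math.pi) / 2    # radius = (circumference / pi) / 2
    radius += max(child_diameters) / 2        # add the largest child radius for spacing
    return max(radius, min_radius)            # never smaller than the space-based value


def child_positions(centre, child_diameters, min_radius=0.0):
    """Centre points of Z children spaced evenly around the central node."""
    if not child_diameters:
        return []
    cx, cy = centre
    z = len(child_diameters)
    radius = ring_radius(child_diameters, min_radius)
    positions = []
    for k in range(z):
        # Assumption: child k sits at k * (360 / Z) degrees.
        angle = math.radians(360.0 / z * k)
        positions.append((cx - math.cos(angle) * radius,
                          cy - math.sin(angle) * radius))
    return positions
```

For example, four children of diameter 1 give a circumference of 4 and a base radius of about 0.64, which grows to about 1.14 once the largest child radius of 0.5 is added.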
Prototype
After developing a method by which to import the XML document containing terms and weights, and devising
a way by which to display them around a central point, a basic prototype was developed. This would not
involve any movement, simply the creation of new child nodes to show the determination of position and scale
in action. By clicking on a child node all the other nodes on the visualisation would disappear, with the
relevant new children being drawn around that child, in its current location. After mastering how Flash
creates and destroys objects, I was then able to focus on moving and animating the process, in particular
moving a clicked node to the centre of the visualisation, as opposed to staying fixed in one position.
Animation
To aid the end user in understanding how data was visualised, and how their browsing in turn affected that
visualisation, it was necessary to animate the process. This would consequently add a novel factor to the
system while making it more usable.
With reference to figure 18, when a user clicks on node A the children B, C & D are created. To show their
relation to A, they will spring out from A.
If A displays a user then the children B, C & D display tags. If D is then clicked, the other tags C & B will
disappear, A will move to the side of the visualisation while D will move to the centre, and its children
will then spring from it. By moving A to the side, the visualisation shows that user A talks about tag D to
the users now circling tag D.
If A displays a tag then the children B, C & D display users who talk about that tag. If user D is then
clicked, the other users C & B will disappear; this time node A will also disappear, with the user D moving
to the centre and becoming the focus of the visualisation.
The movement and animation of an object in Flash is a process known as "tweening". To perform this action
you must specify an object's start state, its final state, and the time and method by which the gap between
the two states should be bridged.
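The idea behind tweening can be shown independently of Flash. The sketch below is a generic Python illustration, not Flash's own tweening API: a value is interpolated from a start state to an end state over a duration, with an easing function deciding how the gap is bridged. The names tween and ease_out_quad are hypothetical.

```python
def linear(t):
    """Constant-speed easing: progress equals elapsed fraction."""
    return t


def ease_out_quad(t):
    """Fast start that slows towards the end state."""
    return 1 - (1 - t) ** 2


def tween(start, end, duration, elapsed, easing=linear):
    """Value of a property at `elapsed` time units into a tween."""
    # Clamp progress to 0..1 so the value never overshoots either state.
    t = min(max(elapsed / duration, 0.0), 1.0)
    return start + (end - start) * easing(t)
```

Calling tween once per frame for a node's x and y properties moves it smoothly from, say, the rim of the diagram to its centre; swapping the easing function changes the feel of the movement without touching the start or end states.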

Figure 18: The flow of movement
Extra Visualisations
Being ahead of schedule with the project, it was possible to create an additional visualisation in Flash.
Using the code already developed to display a number of nodes around a centre point, I decided to create a
map of all users and their contacts. By simply displaying a node for each user around a circle and connecting
those nodes with lines, a map is created of how communication is conducted across a network, as seen in
figure 19.
To achieve this I developed a new XML file, generated dynamically by PHP, to feed in a list of all users and
their related contacts. The structure of this file was as follows.
<users>
<user>
<name>Adam</name>
<contact>Alan</contact>
...
<contact>Stephen</contact>
</user>
<user>
<name>Admin</name>
<contact>Ciara</contact>
...
<contact>Tim</contact>
</user>
</users>
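On the receiving side, a file of this shape reduces to a mapping from each user to their contacts, from which the connecting lines can be derived. The sketch below shows this in Python with the standard xml.etree.ElementTree module; the project itself parsed the file in ActionScript 3, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def contact_map(xml_text):
    """Parse the <users> document into {user name: [contact names]}."""
    root = ET.fromstring(xml_text)
    return {
        user.findtext("name"): [c.text for c in user.findall("contact")]
        for user in root.findall("user")
    }


def edges(contacts):
    """Unique undirected connections, one line per pair of users."""
    return sorted({tuple(sorted((user, contact)))
                   for user, contact_list in contacts.items()
                   for contact in contact_list})
```

Treating each connection as an unordered pair avoids drawing the same line twice when both users list each other as contacts.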
Highlighting only a certain user's connections when that user is selected made this a simple yet effective
visualisation of data.

Figure 19: Users and their contacts
7. RESULTS
Calculating the weights of terms, be they users or tags, was the most computationally expensive process
within the system in its entirety. By ensuring that the database and the code used to process the returned
data were well structured, the system returned its results quickly. By then applying these calculated weights
to a tag cloud or Flash visualisation, the benefits of normalising the numbers can be seen in the smooth
scaling of size to represent importance. This process works well on both large and small sets of data.
Table 2 below shows the adjusted weights of all users who talk about the tag 'Project', and figure 20 shows
how this information was visualised in a tag cloud, using the weight as a percentage of size. The process of
collecting this data from the SQL database, processing and calculating the weights and then outputting the
results to HTML as a tag cloud took 0.0687291622162 seconds.
User          Adjusted Frequency over past 12 months   Normalised Weight
Ciara         2.10125                                  0.851880135
Sean          2.46                                     1
Adam          0.4625                                   0.171899406
Admin         1.3075                                   0.523057634
Andrew        0.3025                                   0.105338999
Chris         0.55125                                  0.208813616
Ciaran        1.15625                                  0.460265812
David         0.55125                                  0.208813616
Deborah       0.25625                                  0.086096865
Eoin          0.30375                                  0.105859046
Hugh          0.5525                                   0.209333498
Hugh Melvin   0.4025                                   0.146940703
Joanne        0.51125                                  0.192176841
John          0.3025                                   0.105338999
Lisa          0.765                                    0.297695876
Marty         0.20375                                  0.064253623
Matthew       0.71875                                  0.278467303
Michael       0.51375                                  0.19321667
Tim           0.3                                      0.104298903
Anthony       0.6525                                   0.250920379
James         0.085                                    0.014843997
Liam          0.105                                    0.023165779
Padraic       0.105                                    0.023165779
Paul          0.0625                                   0.005481941
Sally         0.155                                    0.043969985
Stephen       0.0525                                   0.001321015
Table 2: Frequencies and weights for users using the tag 'Project'

Figure 20: Tag Cloud for the term 'Project'
When applied to a tag with a much smaller set of data, the process still scales well, as seen in table 3 and
figure 21, both representing the tag 'demo'. Calculation for this tag took 0.0687291622162 seconds.
User          Adjusted Frequency over past 12 months   Normalised Weight
Ciara         1                                        0.601614241899
Sean          1.4                                      1
Admin         0.4                                      0.00320258490725
Table 3: Frequencies and weights for users using the tag 'demo'

Figure 21: Tag Cloud for the term 'demo'
This process then transferred over well to the Flash visualisations, as with their movement and animation it
not only became clearer what information was being represented, but where this information was coming
from. As the Flash visualisation would run within the Flash player through the browser on the end user's
machine, the only delay experienced in browsing the information was that of the time taken to request the
XML document from the server, the time taken to generate this and then pass it back to the user's Flash
player for display. Essentially the generation time of the XML document was slightly less than that of the
HTML Tag Cloud page, making the Flash visualisation much faster and therefore more responsive, as the only
delays experienced were the transfer of the XML document across the network, without the delay of having to
render that data in a web browser.
Figure 22 shows the Flash visualisation's display of the tag 'Project' and who talks about it, with figure 23
displaying the resulting visualisation for the term 'demo'.
As can be seen in figure 22, some nodes in the display will overlap others, while some nodes have a scale
that renders the node unreadable. To combat these issues, when a mouse hovers over any part of a node, that
node is brought to the front of the visualisation, and its size is temporarily enlarged to make it readable,
until the mouse leaves that node.

Figure 22: Flash visualisation for the term 'Project'

Figure 23: Flash visualisation for the term 'demo'
When showing a visualisation featuring both a user and a tag, displaying who that initial user talks to about
the tag is performed by moving the initial user to the side, ensuring the end user will understand how the
information is being filtered. For example, figure 24 shows the users with whom Lisa talks about 'drinks'.

Figure 24: Visualisation of both a fixed user and a tag
To represent the users' contact map, there was a need to load only one XML file from the server, listing all
connections. This resulted in a dynamic and fast visualisation running from the end user's machine. To make
clear who was connected to whom, the mouse is simply hovered over a user to highlight their connections, as
seen in figure 25, highlighting who Adam communicates with.

Figure 25: Contact map
8. CONCLUSION
In review of this project and its requirements, I believe that the design and implementation of both the
underlying infrastructure for tagging/storage and the resulting visualisations, and the methods by which they
are generated, have been extremely successful. As hoped at the outset of this project, there was room to
deliver above the required expectations, allowing the development of a second Flash visualisation to
complement the others produced by the system.
As expected in the early planning stages, there was a steep learning curve in developing with Flash, as I had
no experience with the development tool. Thankfully there existed many valuable resources (Moock, Essential
ActionScript 3.0, 2007) to aid me in grasping the platform, along with many online tutorials (Kirupa, 2007)
to aid in the development of the animation concepts present in the visualisations.
A number of unexpected issues arose throughout the project's development stages, such as the need for a
means of normalising the frequencies of terms, to prevent the over-popularisation of one term adversely
affecting the representation of others. Creating an effective technique that would successfully normalise and
smooth numbers for this domain involved a great deal of research, trial and error.
The XMPP protocol as a whole provides developers with a very broad scope of possible functionality due to
both its openness and its harnessing of XML's extensible feature set. The availability of the open-source
Smack API and Openfire server heavily aided development of the customised XMPP network to deal with tagging.
The Smack API allowed for easy development of advanced functionality for the XMPP client by providing a
robust set of basic XMPP function handlers. The Openfire server required the development of a custom plugin
to sit on top of its core functionality, requiring research into the server's architecture and taking
advantage of that in developing the plugin.
I am extremely pleased with the progress made in developing the two halves of this system, specifically the
XMPP network capable of tagging conversations, and the visualisation of data from that network. The chat
client provides the basic functionality of any IM application, while seamlessly tying in the ability to tag
conversations. The method by which conversations are tagged is both unobtrusive and an exercise in social
organisation of information.
The XMPP server provides the capability of handling tags without any impact on the server's other core
activities.
By developing the visualisation system on top of the PHP framework, and harnessing the power of Flash, the
system is truly portable. This provides the basis for a system that is both scalable and interoperable within
any network, with no restriction upon where the information is hosted or displayed. I also feel the resulting
visualisations have exceeded my own expectations for the project, as I was unsure of the level of detail I
would be able to convey and of whether I could achieve an appropriate style of presentation, having no
previous experience with Flash.
On the whole the project has achieved, and furthermore reached beyond, its requirements, with the
visualisations developed providing smooth and clear browsing of the system's vast collection of relational
data, all applied on top of an XMPP chat network that captures data in an effective and social manner.
Extra functionality
As discussed in chapter 4, the database is the core asset within this system, allowing for the effective and
easy retrieval of information through appropriate querying. The design of the database was specifically
tailored for future expansion. With the main stores of information from the XMPP chat network residing in the
two tables tags and conversations, the table users allows for much more expansion and varied approaches to
displaying the data. By simply storing further information on each user, such as their location, position
within the firm or age, the visualisations can be adapted to show topics of conversation based on
geographical location, age or department etc. This can all be achieved by adapting the SQL queries involved in
retrieving data from the database, without any need to alter the visualisation systems themselves. Being
input/output style machines, the visualisation systems would simply be passed department names and tags as
opposed to user names and tags.
Conversely, the flexibility of the database allows for any queries relating to tagged conversations to be
performed. Such ability would allow for the development and construction of new visualisations based on this
information. For instance, the current Flash visualisations could be further adapted with a scroller to show
information over different time periods, allowing the user more control over the display of past data.
The PHP server layer also allows for the information generated by the system to become portable, through the
generation of XML, allowing visualisations to be developed on any platform with the ability to parse XML
documents.
Future uses of the system
Other benefits of the system identified through its development are that it lends itself to being adapted
over any network of people where data-based communication exists.
The concept of this system can be modified only slightly to easily fit into the social networking site
paradigm, or even email systems. With the modularity of this system, the only change needed would be to
connect and adapt the network in question to feed appropriately into this system's database. Further areas
where this concept of knowledge and people visualisation can be applied, delivering value for end users,
would be educational environments such as universities and also many search and IR domains.
Similarly, the concept of tagging can also be further adapted across these paradigms, be they capable of
tagging or not, through automated examination of the content of communication sent across the network. Whilst
this raises a number of ethical questions surrounding privacy, it is a process by which the most data would
be mined from the network, removing the need for human-specified categories such as tags and strengthening
an organisation's map of knowledge. Another question born out of this concept would then be whether, as we
see more advanced VOIP networks develop, we would also see more voice recognition software developed, and
whether this could be used to automatically identify topics of conversation, which are then fed into this map
of knowledge across a system.

To summarise, this has been an enlightening and creatively exciting final year project that has highlighted
the uses and benefits of identifying and visualising knowledge networks hidden amongst social communication
networks.
REFERENCES
Adobe. (2007). Adobe Flash CS3: Features. Retrieved October 2007, from Adobe:
http://www.adobe.com/products/flash/features/
America Online, Inc. (2005). AOL's Third Annual Instant Messenger Trends Survey. Retrieved March 16, 2008,
from AIM: http://www.aim.com/survey/
Brignell, J. (2003). Smoothing of data. Retrieved January 2008, from Number Watch:
http://www.numberwatch.co.uk/smoothing_of_data.htm
Chen, C. (1999). Information Visualisation and Virtual Environments. London: Springer.
Cisco Systems, Inc. (2007). Cisco Unified Presence. Retrieved October 2007, from Cisco:
http://www.cisco.com/en/US/products/ps6837/index.html
Experts Exchange. (2007). Normalize a set of numbers. Retrieved December 2007, from Experts Exchange:
http://www.expertsexchange.com/Other/Math_Science/Q_22802492.html
Faust, K. (2005). Using Correspondence Analysis for Joint Displays of Affiliation Networks. In P. J.
Carrington, J. Scott, & S. Wasserman, Models and Methods in Social Network Analysis (pp. 117-147). Cambridge:
Cambridge University Press.
Ferenc, J. (2006). PHP Tag Cloud Tutorial. Retrieved January 2008, from Prism Perfect: http://prism
perfect.net/archive/phptagcloudtutorial/
Freeman, L. C. (2005). Graphic Techniques for Exploring Social Network Data. In P. J. Carrington, J. Scott, &
S. Wasserman, Models and Methods in Social Network Analysis (pp. 248-269). Cambridge: Cambridge University
Press.
Gilchrist, A. (2004). The Well-Connected Community. Bristol: The Policy Press.
Harrington, J. (1991). Organizational Structure and Information Technology. Hertfordshire: Prentice Hall.
Herbsleb, J. D., Atkins, D. L., Boyer, D. G., Handel, M., & Finholt, T. A. (2002). Introducing instant
messaging and chat in the workplace. Conference on Human Factors in Computing Systems, 171-178.
Jabber Inc. (2004). Case Study: A Leading ILEC/ISP. Denver, CO: Jabber, Inc.
Jabber Software Foundation. (2004). XMPP: Core. Retrieved September 2007, from XMPP RFCs:
http://www.xmpp.org/rfcs/rfc3920.html
Jabber Software Foundation. (2004). XMPP: Instant Messaging and Presence. Retrieved September 2007, from
XMPP RFCs: http://www.xmpp.org/rfcs/rfc3921.html
Jive Software. (2007). Openfire Server. Retrieved November 2007, from Ignite Realtime:
http://www.igniterealtime.org/projects/openfire/index.jsp
Jive Software. (2007). Openfire: Plugin Developer Guide. Retrieved December 2007, from Openfire
Documentation: http://www.igniterealtime.org/builds/openfire/docs/latest/documentation/plugindev
guide.html
Jive Software. (2006). Smack API. Retrieved October 2007, from Ignite Realtime:
http://www.igniterealtime.org/projects/smack/index.jsp
Kerman,P.(2003).SamsteachyourselfMacromediaFlashMX2004in24Hours.Indianapolis,Ind.:Sams.
Kirupa.(2007).AnimatingDynamicMovieClipsinAS3.RetrievedJanuary2008,fromKirupa:
http://www.kirupa.com/developer/flashcs3/animating_dynamic_movieclips_AS3_pg1.htm
Kirupa.(2007).UsingXMLinFlashCS3/AS3.RetrievedJanuary2008,fromKirupa:
http://www.kirupa.com/developer/flashcs3/using_xml_as3_pg1.htm
Lima,M.(2007).Avisualexplorationonmappingcomplexnetworks.RetrievedOctober2007,fromVisual
Complexity:http://www.visualcomplexity.com/vc/
Maruyama,H.,Tamura,K.,&Uramoto,N.(1999).XMLandJava:developingWebapplications.Reading,
Mass.:AddisonWesley.
Maslow,A.H.(1894).ATheoryofHumanMotivation,PsychologicalReview.PsychologicalReview.
Mayo, E. (1949). Hawthorne and the Western Electric Company, The Social Problems of an Industrial Civilisation. London: Routledge.
Microsoft. (2007). Silverlight Whitepapers. Retrieved October 2007, from Silverlight: http://silverlight.net/learn/whitepapers.aspx
Moock, C. (2003). ActionScript for Flash MX: the definitive guide (2nd ed.). Sebastopol, CA.: O'Reilly.
Moock, C. (2007). Essential ActionScript 3.0. Sebastopol, CA.: O'Reilly.
Moreno, J. L. (1932). Application of the group method to classification. New York: National Committee on Prisons and Prison Labour.
Muller, M. J., Raven, M. E., Kogan, S., Millen, D. R., & Carey, K. (2003). Introducing chat into business organizations: toward an instant messaging maturity model. Conference on Supporting Group Work, 50-57.
Quinlivan, J. (2007). Software to Visualise the Connections between Individuals of an Online Social Networking Website. Unpublished master's thesis, National University of Ireland, Galway.
TagCrowd. (2007). Make your own Tag Cloud. Retrieved October 2007, from TagCrowd: http://tagcrowd.com/
Technorati, Inc. (2008). Technorati Help: Tags. Retrieved March 20, 2008, from Technorati: http://technorati.com/help/tags.html
Teknomo, K. (2007, July). Normalization. Retrieved November 2007, from Similarity Measurement: http://people.revoledu.com/kardi/tutorial/Similarity/Normalization.html
Trampoline Systems. (2005). Trampoline Enron Explorer. Retrieved October 2007, from Trampoline Systems: http://enron.trampolinesystems.com/
Vanguard Software Corporation. (2008). Data Smoothing. Retrieved February 2008, from Vanguard Software Corporation: http://www.vanguardsw.com/DpHelp4/dph00109.htm
Warden, J. (2005). AS3 Chronicles. Retrieved February 2008, from Flex and Flash Developer: http://jessewarden.com/2005/10/as3chronicles1simpledrawingexample.html
XMPP Standards Foundation. (2007, January). History of XMPP. Retrieved October 1, 2007, from XMPP: http://www.xmpp.org/about/history.shtml
Young, M. J. (2002). XML Step By Step (2nd ed.). Redmond, Wash.: Microsoft Press.

APPENDIX A
This is a sample Tag Cloud produced using a web tool (TagCrowd, 2007). It visualises the content of the PDD for this project, using all words as tags.

APPENDIX B
An example of the visualisation system integrated into a corporate network.
