
Logical Architecture

[Diagram: layered architecture; box labels recovered from the slide]

App Services: Community Profile service; CMS; Global UI; Applications (e.g. Ency, dashboards, and others)

Data Services: Community Database; Search/query Interface (VQL); Language services; Content Indexes (IDX2); Vertical Optimization; I/O Interface

Unlabeled bottom layer: Library manager; Service Intelligence & Management (NOC); Messaging Network; Global Repository (sub of Library Manager); Search Appliance

1 © 2007 Openwater, Inc. all rights reserved


Machine Tagging: Community Database

Openwater Index

Search Index

Add information to the index and maintain the link



Machine Tagging

• Using open-source Natural Language Processing (GATE) and Machine Learning (SVM) to extract information from unstructured text:
   • Most important concepts for the network
   • Acronyms and their relation to terms
   • Text classification (spam; whether a forum post is a question, an answer, a "me too", ...)
• Takes inputs from structured information
   • CMS provides names of people, email addresses, user names
   • HTML title, heading, and similar tags indicate a higher chance of a concept
   • Network classes representing the product taxonomy
• Feedback loop through user feedback
   • How often users view, link to, or promote to topic the extracted concepts
   • How well the concepts overlap with search strings
   • Users correct spam and text classifications
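
As an illustration of the "acronyms and their relation to terms" item, here is a minimal sketch of one common heuristic: look for a parenthesised acronym and check whether the words just before it spell out its initials. This is not the GATE/SVM pipeline the slide describes; it is a plain regular-expression stand-in, and the function name `find_acronym_definitions` and the sample sentence are assumptions made for the example.

```python
import re

# Matches a parenthesised run of 2-6 capital letters, e.g. "(SVM)".
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z]{2,6})\)")


def find_acronym_definitions(text):
    """Return {acronym: long form} for 'Long Form (LF)' patterns in text.

    Hypothetical helper: a regex heuristic, not the GATE/SVM approach.
    """
    found = {}
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        # Take as many preceding words as the acronym has letters.
        words_before = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words_before[-len(acronym):]
        if len(candidate) < len(acronym):
            continue
        # Accept only if the initials of those words spell the acronym.
        initials = "".join(word[0].upper() for word in candidate)
        if initials == acronym:
            found[acronym] = " ".join(candidate)
    return found


sample = "We train a Support Vector Machine (SVM) on forum posts (FAQ)."
definitions = find_acronym_definitions(sample)
print(definitions)
```

In the sample, "(SVM)" is linked to "Support Vector Machine", while "(FAQ)" is skipped because the preceding words do not match its initials. A learned model could replace the initials check while keeping the same output shape.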



Forum Benchmark

• Scope is 800 posts in the JI (JasperIntelligence) forum
• Scoring rewards the correct topic high in the results list, and answers over questions
• Currently understands Jive, Joomla!, and vBulletin forums
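
The slide does not give the scoring formula, so the following is only a sketch of one scheme with the two stated properties: the correct topic scores more the higher it ranks, and an answer scores more than a question. The weights, the reciprocal-rank discount, and the names `score_query` and `benchmark_score` are all assumptions made for illustration.

```python
# Hypothetical weights: the slide only says answers are rewarded over questions.
POST_TYPE_WEIGHT = {"answer": 1.0, "question": 0.5}


def score_query(results, correct_topic):
    """Score one query by its first hit on the correct topic.

    `results` is a ranked list of (topic_id, post_type) pairs.
    The score is the post-type weight divided by the 1-based rank,
    so a hit lower in the list earns less (an assumed discount).
    """
    for rank, (topic, post_type) in enumerate(results, start=1):
        if topic == correct_topic:
            return POST_TYPE_WEIGHT.get(post_type, 0.0) / rank
    return 0.0


def benchmark_score(queries):
    """Mean per-query score over (results, correct_topic) pairs."""
    if not queries:
        return 0.0
    return sum(score_query(r, t) for r, t in queries) / len(queries)


queries = [
    ([("t1", "answer"), ("t2", "question")], "t1"),  # answer at rank 1
    ([("t9", "answer"), ("t3", "answer")], "t3"),    # answer at rank 2
]
print(benchmark_score(queries))
```

Under these assumed weights a perfect run scores 1.0, which is at least consistent with the benchmark reporting scores between 0 and 1.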

JasperIntelligence Benchmark results (6 posts per page)

Query set / method              Relative score   Comments
Each subject as a query
  Standard OPIC                 0.685            baseline; would be much worse at 1 post per page
  Forum link structure          0.942            37% better
  Post type classification      0.984            44% better
Subjects with acronyms
  Spelled-out queries (novice)  0.166            baseline
  Acronyms added back in        0.753            about 4.5x the baseline (slide says "450% better")

• Similar trends on Ingres and Oracle forums
• Just the start: rapid improvements in acronym detection & classification
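
The comment column can be checked directly from the scores. The sketch below computes relative improvement over a baseline as (score - baseline) / baseline; the function name `relative_improvement` is an assumption, and the numbers are the ones reported on the slide.

```python
def relative_improvement(score, baseline):
    """Percentage gain of `score` over `baseline`."""
    return (score - baseline) / baseline * 100.0


# Scores as reported on the slide.
opic_baseline = 0.685
link_structure = 0.942
post_type = 0.984
novice_baseline = 0.166
acronyms_added = 0.753

print(round(relative_improvement(link_structure, opic_baseline), 1))
print(round(relative_improvement(post_type, opic_baseline), 1))
print(round(relative_improvement(acronyms_added, novice_baseline), 1))
print(round(acronyms_added / novice_baseline, 2))
```

The first two gains come out at roughly 37.5% and 43.6%, in line with the slide's 37% and 44%. For the acronym case the ratio is about 4.5, so the slide's "450%" reads as the score relative to the baseline; the gain over the baseline is closer to 354%.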

