Thursday, May 7, 2009

Clever Retrieval – Semantic Search

1. Concept
Information retrieval is a technology and a science that is already familiar to us.
According to the Wikipedia definition, it is the field of science concerned with searching for documents or for information within documents. Google and Yahoo are examples of document retrieval software that we can easily find.
Then, what is Semantic Search?
Again according to the Wikipedia definition, Semantic Search is a research field that aims to improve existing search performance by using data on the semantic network, such as XML and RDF. It is a type of search that uses the semantic information of language as context when computing the similarity between the search query and a document, in a way comparable to Google's PageRank.
In fact, the term semantic search is used widely, covering everything from NLP technology to search built on semantic technology, as concepts such as the Semantic Web and ontology have appeared.

Consequently, Semantic Search is at this point a new paradigm in the field of search technology, one that is developing toward diverse technologies and service models.

2. Approaches to Semantic Search
As briefly noted above, semantic search has become an issue because it addresses a common concern of the information retrieval business: providing search results that match the intent of the user. The aim is to develop search technology that both computes the similarity between query and document from keyword frequency, as in TF-IDF (Term Frequency-Inverse Document Frequency), and understands the meaning of the information.
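
The keyword-frequency side of this can be illustrated with a minimal sketch of TF-IDF scoring. It assumes whitespace tokenization, the plain tf = count / length and idf = log(N / df) variants, and three made-up sample documents; real search engines use more refined weighting.

```python
import math
from collections import Counter

def tf_idf_scores(query, documents):
    """Score each document against the query by summing TF-IDF weights."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores.append(score)
    return scores

# Hypothetical sample documents, for illustration only.
docs = [
    "semantic search uses ontology",
    "keyword search counts keyword frequency",
    "museum ontology and museum metadata",
]
scores = tf_idf_scores("ontology museum", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

The score depends only on which words appear and how often, so a document that expresses the same idea in different words scores zero. That gap is what the semantic approaches described here try to close.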

We would like to review the directions of research and development in semantic search. The semantic search that arose with the appearance of the Semantic Web may be divided into two fields.
The first is to search for semantically tagged instances using a semantic query language and keyword search, after modeling the domain knowledge (concepts and relationships) as an ontology in languages such as RDF/S and OWL and semantically annotating the target documents. The other is to search for semantically tagged RDF/S or OWL information that already exists on the web. In practice, the former may be developed as vertical search for a specific domain or targeted at the web as a whole.



[Figure 1, Semantic Tagging of HTML]

Information expressed in HTML can be tagged in any of several semantic languages. Key semantic languages include RDF, RDF/S, OWL, Microformats, and RDFa, most of which are W3C standards. Such semantic tagging is not limited to HTML: a predefined meaning can be attached, using a semantic language, to various kinds of data such as text, HTML tags, and relational database (RDB) data. In Semantic Web technology, this predefined meaning is the ontology. An ontology may be defined independently according to the domain and service type of the information, or an already defined ontology such as Dublin Core, SIOC, SKOS, FOAF, ResumeRDF, or DOAP can be referenced and used.
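
As a rough illustration of attaching predefined meaning to a web page, the sketch below describes a page with terms from Dublin Core and FOAF and writes the result as N-Triples. The `http://example.org/` URIs, the helper names, and the author name are hypothetical; only the two vocabulary namespaces are real.

```python
DCTERMS = "http://purl.org/dc/terms/"   # Dublin Core terms namespace
FOAF = "http://xmlns.com/foaf/0.1/"     # FOAF namespace
EX = "http://example.org/"              # hypothetical namespace for this sketch

def uri(value):
    """Format a URI as an N-Triples IRI reference."""
    return f"<{value}>"

def literal(text):
    """Format a plain string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def tag_page(page_uri, title, author_uri, author_name):
    """Describe a page and its author as (subject, predicate, object) triples."""
    return [
        (uri(page_uri), uri(DCTERMS + "title"), literal(title)),
        (uri(page_uri), uri(DCTERMS + "creator"), uri(author_uri)),
        (uri(author_uri), uri(FOAF + "name"), literal(author_name)),
    ]

def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

triples = tag_page(EX + "post/1", "Semantic Search",
                   EX + "people/author", "Example Author")
print(to_ntriples(triples))
```

Reusing an existing vocabulary in this way means that any tool that already understands Dublin Core or FOAF can interpret the tags without a custom ontology.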



[Figure 2, Data integration through SPARQL]

Information created through semantic tagging can be queried with a semantic query language such as SPARQL (SPARQL Protocol and RDF Query Language).
SPARQL is a W3C standard and a query language suited to graph-structured data such as RDF, a step beyond earlier languages such as DQL and RDQL.
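
To give a feel for how a graph query works, here is a minimal sketch of the basic idea behind SPARQL's triple patterns: each pattern is matched against the stored triples, and variables (written with a leading `?`) must bind consistently across patterns. This is a toy matcher over made-up data, not a SPARQL engine; the prefixed names are plain strings used for illustration.

```python
def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return extended bindings or None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was already bound to.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

# Hypothetical data, for illustration only.
triples = [
    ("ex:post1", "dc:title", "Semantic Search"),
    ("ex:post1", "dc:creator", "ex:alice"),
    ("ex:post2", "dc:creator", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]

# Roughly: SELECT ?name WHERE { ?post dc:title "Semantic Search" .
#                               ?post dc:creator ?who . ?who foaf:name ?name }
patterns = [
    ("?post", "dc:title", "Semantic Search"),
    ("?post", "dc:creator", "?who"),
    ("?who", "foaf:name", "?name"),
]
results = query(patterns, triples)
print([r["?name"] for r in results])
```

Because the shared variables `?post` and `?who` join the patterns, the query follows links across the graph, which is what makes this style of language a natural fit for RDF.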



[Figure 3, KERIS Semantic based search by Saltlux]

[Figure 3] shows a case of the first approach: the semantic search system that Saltlux built for KERIS, which queries and searches semantically tagged RDB data, such as Oracle data, and adopts F-logic rule-based reasoning.



[Figure 4, Museum Finland by SECO]

[Figure 4] shows the Museum Finland project, developed by SECO (Semantic Computing Research Group) of Finland to build a knowledge base from museum information based on ontology metadata. It is similar to the KERIS case by Saltlux shown in Figure 3.

Semantic languages such as RDF and OWL, accepted as standards by the W3C, offer broad and deep knowledge expression and are well suited to producing machine-readable information through reasoning at the level of description logic.
They also have the advantage of enabling integrated search through ontologies and a semantic query language. On the other hand, the level of expression must be adjusted, because current technology is not yet sufficient for automatic annotation of existing information. However, research and development is in progress to improve automatic annotation through technologies such as text mining and NER (named entity recognition). Authoring tools that ease the semantic tagging of existing and newly created documents, together with technologies such as RDFa and Microformats, are likely to stimulate the development of semantic search engines.



[Figure 5, SWSE by DERI]

[Figure 5] shows the SWSE search engine developed by DERI Laboratories, which searches and navigates objects expressed in Semantic Web standard languages using an object-oriented concept. Its interface lets users navigate information in units of objects. The concept is not to search text documents on the web but to search RDF resources as objects. SWSE collects billions of RDF documents from the Falcon, Swoogle, Watson, and DBpedia data sets, each with its own unique URI, and provides search services over them. Internally, SWSE processes queries with SPARQL, the W3C standard query language.



[Figure 6, SWOOGLE by UMBC]

Ontologies and semantic language resources on the web can be retrieved with UMBC Swoogle, shown in [Figure 6].

Second, the natural language search area based on NLP includes question-answering search systems that present an answer by analyzing a natural language query in sentence form and discovering its meaning. However, the recent trend is to move away from analysis of complex natural language queries and to provide results that balance keyword search and sentence-style search, as Powerset (http://www.powerset.com/) does in [Figure 7].



[Figure 7, Search by powerset.com]

SenseBot (http://www.sensebot.net), shown in [Figure 8], is a search engine that returns a summary of the sites or documents found for a search term, rather than a list of web pages. It produces the summary through text mining on top of a Google-type search engine. In this way, research on semantic search from a linguistic viewpoint, using NLP and text mining, continues steadily.



[Figure 8, SenseBot Search Engine]

Next, the area of search through visual browsing shows related information by attaching additional information tags to search index terms, and it is developing toward easier discovery for the search user. This area is not called semantic search as such, but together with the introduction of Web 2.0 technology, the current search trend includes many such functions. Owlim.com, shown in [Figure 9], is a service that automatically creates relationships between words using Korean-language entity retrieval and keyword co-occurrence; it visually presents the related information and searches summaries of the contents.
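
Relating words by keyword co-occurrence can be sketched as follows. This is only a minimal illustration of the general technique, not how Owlim.com is implemented: it assumes whitespace tokenization (Korean text would need a morphological analyzer), counts each word pair once per document, and uses made-up sample documents.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(documents):
    """Count how many documents each unordered pair of terms appears in together."""
    pairs = Counter()
    for doc in documents:
        terms = sorted(set(doc.lower().split()))
        pairs.update(combinations(terms, 2))
    return pairs

def related_terms(term, pairs, top_n=3):
    """List the terms that most often co-occur with the given term."""
    related = Counter()
    for (a, b), count in pairs.items():
        if a == term:
            related[b] += count
        elif b == term:
            related[a] += count
    return related.most_common(top_n)

# Hypothetical sample documents, for illustration only.
docs = [
    "semantic search ontology",
    "semantic web ontology",
    "keyword search engine",
]
pairs = cooccurrence(docs)
print(related_terms("ontology", pairs))
```

The resulting pair counts form a weighted graph of terms, which is the kind of structure a visual browser can draw as a network of related keywords around the search term.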



[Figure 9, Search by owlim.com by Saltlux]



[Figure 10, search by evri.com]

So far we have reviewed the academic and industrial approaches to achieving the objectives of semantic search.

Meaning-based or semantic search, as the name says, is a term that covers a wide range of domains and technologies, and it is not easy to give it a simple, concrete definition.
In this article our intention was to review the areas of ontological search, text mining, and the improvement of keyword search using semantic technology.

3. Conclusion
Search technology is key to the flow of information both within a company and on the web. Every second, keywords are entered into numerous search sites and the results flow back to users. Search users are accustomed to current search technology and express their needs themselves. Those needs are quite diverse: results that match their purpose, additional related information, results that are easy to read and explore, time savings, resolution of ambiguous meanings, and so on. Research on semantic search technology is proceeding to meet these needs.

As the cases reviewed above show, current semantic search technology focuses on reaching a more advanced level of technology by using semantic techniques, including ontology and text mining. This should be understood not as the development of semantic search in the abstract but as an R&D trend toward communicating knowledge and discovering information within natural information-seeking behavior, and ultimately toward providing users with better-quality information.

Great Success, Search & Discovery Seminar of Saltlux

Saltlux Inc. (www.saltlux.com), one of the world leaders in the semantic search market, successfully held its first seminar, “Search and Discovery that leads the knowledge world,” on April 27 this year at the COEX Grand Ballroom in Samsung-dong, Seoul, Korea.
At this seminar Saltlux showed its partner companies and customers how to increase the value of an enterprise's in-house knowledge assets through the storage, sharing, and use of documents.

In the introductory presentation, Tony Lee, the president and CEO of Saltlux, emphasized, “Even though a huge number of documents are produced in corporate work processes, know-how is often not shared within the organization. Registration and upload methods make sharing difficult, and teams that think only of themselves worry about disclosing problems, so only the final results are shared. To resolve this issue, an organization needs an automated system that is easy to use, has a reasonable introduction cost, can be set up and operated without a skilled IT specialist, and offers strong security and flexible connectivity.”

At the seminar, Saltlux introduced the hardware-integrated [IN2]SearchBox, which provides a document archive and intelligent search, as a way to resolve this issue; with it, a company can improve business productivity and efficiency while also saving costs.

Presenting use cases of the [IN2]SearchBox system at a patent agent, a consulting company, and a university research laboratory, HyungJune Park of Saltlux emphasized, “The need to search and share is urgent because the quantity of information has increased tremendously as documents have become electronic. The work of referring to past records is also increasing when dealing with similar documents and projects. [IN2]SearchBox is the answer to this issue.”

At the seminar Saltlux also introduced [IN2]Discovery, which is being launched in May this year. [IN2]Discovery is a semantic search platform that provides insight by reorganizing and analyzing information, and that supports decision making through the discovery and use of hidden information.

Saltlux plans to hold a seminar on the analysis and use of information in late May to commemorate the launch of [IN2]Discovery, under the theme “Search and Discovery that leads the knowledge world, the second story.”

Sunday, April 12, 2009

Saltlux Holds a Seminar On [IN2]SearchBox, a New Hardware Integrated Search Solution

April 12, 2009
“On April 27, 2009, come, see, and experience an innovative way to share and use valuable documents, and a new business opportunity, with one SearchBox.”

Does your company make good use of its scattered and hidden in-house documents as knowledge assets?
Are you hesitating to introduce a system because you do not know the solution or the cost?
In this difficult economic situation, Saltlux Inc. will propose a cost-innovative and reasonable alternative for putting your company's document resources to work.
The first seminar of “Search & Discovery that leads the knowledge world” will be on “[IN2]SearchBox: storing, sharing and utilization, settled all in one.”


Date and time:- April 27, 2009; 14:00 ~ 17:00
Venue:- Grand ballroom #101, Hotel CoEx Intercontinental, Samsung-dong, Seoul
Pre-registration:- Online by April 25, 2009; the first 150 registrants will be served first
(Admission is free for the first 150 online registrants at www.saltlux.com.)

Agenda
13:00 ~ 14:00 Registration & Product Exhibition

First Session: On [IN2]SearchBox, new utilization of documents and information

14:00 ~ 14:20 Opening & Keynote: Saltlux Solution & Product Roadmap
14:20 ~ 14:50 [IN2]SearchBox, an alternative way for activation of knowledge ecosystem
14:50 ~ 15:10 Japanese case study of SearchBox
15:10 ~ 15:30 Domestic case study of SearchBox
15:30 ~ 16:00 Coffee Break & Product Exhibition

Second Session: Search & Semantic utilization Solution

16:00 ~ 16:25 Creation of semantic data and search for economic value improvement
16:25 ~ 16:50 Search & Discovery, a lead for knowledge world
16:50 ~ 17:00 Q&A, Drawing for free gift
17:00 ~ 18:00 Product Exhibition & Networking