Developing a Community-Based Knowledge System: A Case Study Using Sri Lankan Agriculture

The agriculture sector plays a vital role in Sri Lanka's economy. A major problem is the lack of an agricultural knowledge repository that people in the agriculture community in Sri Lanka can easily access within their own context. As a solution, a large user-centred ontology for Sri Lankan farmers was developed to provide the required information and knowledge not only in a structured and complete way, but also in a context-specific manner. Since this problem is not limited to farmers, we extended the ontology to everyone working in the agriculture domain. We validated the ontology in terms of accuracy and quality. An online knowledge base built on the ontology, with a SPARQL endpoint, was created to share and reuse the domain knowledge, which can be queried based on user context. A mobile-based application and a web-based application were developed to deliver information and knowledge using this ontology. These applications are also used to evaluate the ontology by collecting user feedback on the knowledge it contains. A large, complex ontology is very difficult to maintain. To maintain our ontology, we identified the processes required to develop and maintain an ontology collaboratively. A semi-automatic end-to-end ontology management system was developed to manage the ontology and the knowledge base. It provides facilities to reuse, share, modify, extend, and prune the ontology components as required. Facilities to capture users' information needs and to search domain information in user context are also included. In this paper, we present a summary of the overall development process of the ontology, including the end-to-end ontology management system.

Keywords: Agricultural Information/Knowledge, Contextual Information, Knowledge Modeling, Ontology, Ontology


I. INTRODUCTION
Agriculture is an important sector in the Sri Lankan economy. Of the total population of Sri Lanka, 31.8% engages in agricultural activities [1]. People in the agriculture domain need agricultural information and relevant knowledge to make informed decisions and satisfy their information needs. For example, farmers need information on pests and diseases, control methods, seasonal weather, best varieties or cultivars, seeds, fertilizers, and pesticides to manage their farming activities [2], [3]. Other stakeholders in the domain, such as agricultural instructors, researchers, information specialists, and policy makers, also need agricultural information. For example, researchers are interested in how to solve pest problems, the symptoms of crop diseases, and the usage of fertilizers and pesticides for research purposes. Agricultural instructors also need domain-specific information to help farmers in their region. Thus, all stakeholders in the agriculture community need agricultural information relevant to them to make better decisions, do further research, or analyze the information for future needs and predictions. They can get some of this information from multiple sources such as agricultural websites, agriculture department leaflets, and mass media. However, the information in these sources is general, incomplete, heterogeneous, and not structured to meet their needs. They require information within the context of their specific needs, in a structured and complete manner. Such information could make a greater impact on their decision-making process [4].
A major problem is the lack of an agricultural knowledge repository that is consistent and well-defined, and that represents the agricultural information and knowledge needed by farmers within their own context. Moreover, this problem is not limited to farmers; it affects everyone working in the agriculture domain.
Social Life Networks for the Middle of the Pyramid (www.sln4mop.org) is an international collaborative research project that aims to develop a mobile-based information system to support the livelihood activities of people in developing countries [5]. The research presented in this paper is part of the Social Life Networks project and aims to provide agricultural information and knowledge to farmers in Sri Lanka, based on their own context, using a mobile-based information system. This system has now been expanded to include everyone working in the agriculture domain in Sri Lanka through the development of an end-to-end ontology management system with a web-based interface.
To represent information in a context-specific manner, we first need to identify the users' context (i.e., the users' context model). Since farmers are the main stakeholders in the agriculture community and other stakeholders are willing to help farmers in various ways, we identified the users' context specific to farmers in Sri Lanka: farm environment, types of farmers, farmers' preferences, and farming stages [6]. The farming stages that we identified as relating to our application are Crop Selection, Pre-Sowing, Growing, Harvesting, Post-Harvesting, and Selling [6].
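As a minimal sketch of how such a user context model might be held in software, the Python fragment below encodes the six farming stages as an enumeration and the four context dimensions as a record. The class and field names (FarmingStage, FarmerContext, and so on) and the sample values are illustrative assumptions, not part of the published context model.

```python
from dataclasses import dataclass, field
from enum import Enum


class FarmingStage(Enum):
    """The six farming stages named in the text, in life-cycle order."""
    CROP_SELECTION = 1
    PRE_SOWING = 2
    GROWING = 3
    HARVESTING = 4
    POST_HARVESTING = 5
    SELLING = 6


@dataclass
class FarmerContext:
    """Hypothetical record holding the four context dimensions."""
    farm_environment: dict   # e.g. {"location": "Galle"} (assumed keys)
    farmer_type: str         # e.g. "smallholder" (assumed value)
    preferences: list = field(default_factory=list)
    stage: FarmingStage = FarmingStage.CROP_SELECTION

    def next_stage(self):
        """Return the stage after the current one, or None after Selling."""
        members = list(FarmingStage)
        idx = members.index(self.stage)
        return members[idx + 1] if idx + 1 < len(members) else None
```

Keeping the stage as an ordered enumeration makes it straightforward to select the questions relevant to where a farmer currently is in the life cycle.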
Next, we identified ontologies as an optimum way to organize information and knowledge in user context. An ontology provides a structured view of domain knowledge and acts as a repository of concepts in the domain [7]. The most quoted definition of ontology was proposed by Thomas Gruber: "an ontology is an explicit specification of a conceptualization" [8]. We selected a logic-based ontological approach to create our knowledge repository, mainly because of the complex nature of the relationships among concepts and the need to attenuate the incompleteness of the data and to add semantics and background knowledge about the domain.
We first developed an ontological approach to represent the necessary agricultural information and relevant knowledge within the user context [6]. Using this approach, we designed the ontology to include the information needs identified for the first stage of the farming life cycle [9]. Next, we extended the ontology to include events associated with the farming life cycle, such as fertilizers, growing problems, and their control methods [10]. A revised and enhanced version of the work, including the creation of an online knowledge base and an information retrieval interface, has been published in [11]. In this paper, we present a summary of the overall development process of the user-centered ontology and the end-to-end ontology management system for the domain of agriculture in Sri Lanka. The user-centered ontology was implemented using the Protégé editor (based on OWL 2 DL). A web-based ontology management system was developed based on the framework explained in [12].
The remainder of the paper is organized as follows. Section II summarizes the development process of the ontology. Section III summarizes the end-to-end ontology management system. Finally, Section IV concludes the paper and describes future directions.

II. ONTOLOGY DEVELOPMENT PROCESS
To clearly identify the process of developing the ontology to represent information in user context, this section is organized in five parts: users' information needs, users' information needs in context, representation of contextualized information, the generalized design approach, and the validation and evaluation process. The framework we identified to maintain the ontology for our application is described separately in Section III.

A. Users' Information Needs
First, we extracted domain-specific knowledge from reliable knowledge sources [2], [3], [13]-[17] and by interviewing farmers as well as other stakeholders in the agriculture community. By analyzing the information gathered from these sources, we identified what information users in the agriculture domain require at various stages to support better decisions, problem solving, and other information needs. As a result of this analysis, the information important to users was identified in the form of questions. Some examples are given in Table I. In this study, we identified that farm environment, types of farmers, farmers' preferences, and farming stages (together considered the user context model) are the important factors to consider when delivering agricultural information and knowledge to farmers [6].

B. Users' Information Needs in Context
We identified areas of generic crop knowledge required to answer the users' information needs (see Table I). We call these broad areas of knowledge "knowledge modules". The generic crop knowledge consists of modules such as nursery management, harvesting, post-harvesting, growing problems, control methods, fertilizer, environmental factors, crops and basic characteristics of crops, and variety. For example, the crop module holds information about crops, and the fertilizer module holds the fertilizer information and knowledge needed by domain users. Next, we identified the relationships among the modules. Fig. 1 shows the generic crop knowledge module. This modularization helps us reduce the complexity of real-world scenarios in the application domain. A large ontology is very hard to maintain; modularization lets us maintain it as small blocks within the knowledge module. We organized the users' list of information requirements according to the farming life cycle stages (six stages: Crop Selection, Pre-Sowing, Growing, Harvesting, Post-Harvesting, and Selling). We begin our detailed design process with the first question in the list: "What are the suitable crops to grow?" Choosing the best crop for an individual situation is difficult, since one has to consider many factors, such as environmental conditions, which can vary by region and time period, the preferences of the user, and the resources available to them for cultivation. We therefore reviewed the existing literature on crop selection to identify suitable criteria for making better decisions. We then summarized the existing criteria and identified suitable crop selection criteria for our application based on the requirements of the agriculture community in Sri Lanka [11]. They include the environmental conditions, the special characteristics of a crop, user preferences, what other farmers grow in different regions and in what quantities, and market information.
In a similar way, we identified the criteria for each item in the list of user information requirements. For example, we defined the criteria for applying fertilizers, and for growing problems and their control methods, which relate to the second and third stages of the farming life cycle respectively. When applying a fertilizer to a specific crop, the user needs to know the fertilizer quantity and its unit. The fertilizer quantity depends on many factors, in particular the location, water source, soil pH range, time of application, application method, and fertilizer type. In addition, the cost, the land size required for a particular fertilizer, and other special information need to be considered. The fertilizer quantity therefore needs to be specified in relation to all of this information. To do that, we introduced a new information module, Fertilizer Event, to represent this additional information, together with new relationships to describe this event. More details about the criteria for applying fertilizers and selecting control methods are given in [10]. A summary of these criterion factors is shown in Table II. The next step is the formulation of a set of contextualized or personalized information based on the users' information needs. For this, we had to develop our own approach. With the help of the domain experts, we first identified the breadth of information required by users. Next, based on the previously identified user context, we identified the conditions that can be used to obtain a subset of information that satisfies a specific information need. Based on this, we expanded the questions in the user information need list to include the user context.
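The idea behind the Fertilizer Event, tying a quantity to all of the factors it depends on rather than to the crop alone, can be illustrated with a small record type. The following is only a sketch under stated assumptions: the field names, the lookup function find_quantity, and every sample value (including the quantities) are made up for illustration and are not taken from the ontology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FertilizerEvent:
    """One recommendation: a quantity that is only meaningful together
    with the factors it depends on (all field names are illustrative)."""
    crop: str
    fertilizer: str
    location: str
    water_source: str
    soil_ph_range: tuple      # (low, high), inclusive
    time_of_application: str
    application_method: str
    quantity: float
    unit: str


def find_quantity(events, crop, fertilizer, location, soil_ph) -> Optional[tuple]:
    """Return (quantity, unit) of the first event matching the context."""
    for e in events:
        low, high = e.soil_ph_range
        if (e.crop == crop and e.fertilizer == fertilizer
                and e.location == location and low <= soil_ph <= high):
            return e.quantity, e.unit
    return None


# Made-up sample data, for illustration only (not agronomic advice).
EVENTS = [
    FertilizerEvent("Brinjal", "Urea", "DryZone", "irrigated", (5.5, 6.5),
                    "basal", "broadcast", 75.0, "kg/ha"),
    FertilizerEvent("Brinjal", "Urea", "WetZone", "rain-fed", (5.5, 6.5),
                    "basal", "broadcast", 60.0, "kg/ha"),
]
```

With this shape, the same crop and fertilizer can carry different quantities in different locations, which is the situation the event module is meant to capture.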
Fig. 2 shows our basis for formulating contextualized information. For crop selection, for example, the formulation of contextualized information depends on the users' context, general crop knowledge, the crop selection criteria (more generally, a task modeling criterion specific to the question, such as the crop selection criterion, fertilizer application criterion, or control method selection criterion), and the users' constraints (conditions). This serves as the basis for formulating information in a user context for our application.

Fig. 2 Basis for Modeling Contextualized Information
Some examples of contextualized information related to crop selection, fertilizer application, and control method selection are given in Table III. We have identified the user constraints based on each criterion factor. We need to select suitable information based on different locations, seasons, soil factors, types of control methods, and so on, or on combinations of these constraints, to help users make better decisions. We have identified the different constraints related to this application. For example, we identified the location as a Zone, Agro Zone, Elevation-based location, Province, District, or Regional area (see Fig. 3 (a)). The relationships among these are also complex because of the meaning of the terms. For example, Agro Zone is a Zone, Zone is a Location, and Variety is a Crop; the environmental factor is represented similarly (see Fig. 3 (a)). The definitions of the terms also need to be considered to attenuate the incompleteness of the data (see Fig. 3 (b)). Furthermore, we need to represent the semantic meaning of the terms: for example, if Magalle (location) belongs to Galle (location) and Galle belongs to WetZone (location), then Magalle belongs to WetZone (see Fig. 3 (c)). Through this process, we formulated the contextualized questions covering all constraints relevant to each criterion. We also generalized these questions (see Table III). This is the range of questions we want to answer by organizing agricultural information and knowledge in an ontology that can be queried in context.
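The Magalle example is an instance of transitive inference. A minimal sketch in plain Python (not the reasoner used with the ontology) shows how the implied fact can be derived from the two stated ones; the function name transitive_closure is our own.

```python
def transitive_closure(pairs):
    """Return all (a, c) pairs implied by chaining (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted facts from the example in the text.
BELONGS_TO = {("Magalle", "Galle"), ("Galle", "WetZone")}

inferred = transitive_closure(BELONGS_TO)
print(("Magalle", "WetZone") in inferred)  # True
```

A query for locations in WetZone can then return Magalle even though that fact was never stated directly.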

C. Representation of Contextualized Information
An ontology provides a structured view of the domain knowledge and acts as a repository of concepts in the domain. This structured view is essential to facilitate knowledge sharing, knowledge aggregation, information retrieval, and question answering [7]. We selected a logic-based ontological approach to represent the contextualized information and knowledge (in Table III), mainly because of the complex nature of the relationships among concepts and the need to attenuate the incompleteness of the data and to add semantics and background knowledge about the domain (see Fig. 3). The resulting representation can be used to answer queries within a specified context in the agriculture domain.
We reviewed ontology development methodologies and techniques to identify a suitable ontology development approach. Grüninger and Fox [19] have published a formal approach to designing an ontology that also provides a framework for evaluating the adequacy of the developed ontology. We therefore selected Grüninger and Fox's methodology, a logic-based approach, to develop a user-centric ontology for the agriculture community.
Our ontology creation begins with the definition of the set of users' information needs identified in Table I. We take these information needs as the main motivation scenario of our application to provide information in context. Competency questions (CQs) determine the scope of the ontology and are used to identify its contents. The ontology should be able to represent the CQs using its terminology, axioms, and definitions. A knowledge base built on the ontology can then provide answers to these questions [19]. Formulating the CQs is therefore a very important step, because these questions guide the development of the ontology. In our application, the contextualized information (see Table III) has been used as the CQs to develop the ontology, because it satisfies the expressiveness and reasoning requirements of the ontology (see Fig. 3).
The different constraints in the domain are represented using OWL 2 DL (see Fig. 3). Fig. 3 (a) represents the semantic meaning of the concepts using class hierarchies. Because of the taxonomic hierarchy (is-a relationship), sub-concepts inherit the properties of their parent concepts, and instances of a sub-concept are also instances of the super-concept. The definition of a concept, for example DryZone, is represented in Fig. 3 (b). Instances need to be classified based on these definitions; the reasoner attached to the Protégé tool can be used for this classification. The relation belongsTo between instances of the Location concept is defined as a transitive property, as shown in Fig. 3 (c). Based on the existing information, additional knowledge can be inferred using the composition of relations (e.g., the relation GRANDFATHEROF is composed of the relations FATHEROF and PARENTOF). We used this property to infer additional knowledge (see Fig. 3 (d)); the object property chain in the Protégé tool is used for this representation. Fig. 4 shows the Fertilizer Event represented using the Cmap tool. The Cmap (Concept Map) tool is used to view a graphical representation of the ontology for better user understanding [18]. The details of modeling the events associated with the second and third stages of the farming life cycle, and the associated challenges, are explained in [10].
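The two inference patterns described here, inheritance along the is-a hierarchy and composition of relations, can be mimicked in a few lines of plain Python. This is a sketch for intuition only and does not reproduce what an OWL reasoner does; the helper names compose and superclasses and the individuals in the GRANDFATHEROF example are hypothetical.

```python
def compose(first, second):
    """Property chain: (x, z) holds when first(x, y) and second(y, z)."""
    return {(x, z) for x, y1 in first for y2, z in second if y1 == y2}


def superclasses(cls, subclass_of):
    """All ancestors of cls under the is-a hierarchy.

    Assumes each class has at most one parent (child -> parent mapping),
    which is a simplification of a real class hierarchy.
    """
    result = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        result.append(cls)
    return result


# is-a links quoted in the text.
SUBCLASS_OF = {"AgroZone": "Zone", "Zone": "Location"}

# Hypothetical individuals for the GRANDFATHEROF example.
FATHER_OF = {("Sunil", "Kamal")}
PARENT_OF = {("Kamal", "Nimal")}

grandfather_of = compose(FATHER_OF, PARENT_OF)
```

Here an instance of AgroZone would also count as a Zone and a Location, and the grandfather relation is obtained without being asserted.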
The ontology implemented in Protégé is available at http://www.sln4mop.org/ontologies/2014/SLN_Ontology. It consists of 90 concepts, 205 object properties, and 45 data properties. Currently it covers 23 vegetable crops, 10 fertilizers, 19 growing problems, and 30 control methods. Further details of the ontology development are given in [11].

D. Generalizing Approach
We have generalized the specific approach that was developed to create the user-centered ontology for Social Life Networks. Fig. 5 shows this generalized approach. According to this approach, we first identify a set of questions (users' information needs) that reflect various motivation scenarios. Next, we create a model to represent information in user context. Then we derive the contextualized information by combining the user context and task modeling with the generic knowledge module. We refer to this contextualized information (see Table III) as the informal CQs. These CQs are used to identify the ontology components according to Grüninger and Fox's methodology for developing the ontology.
Using this framework, we can extend the ontology to different scenario problems. For example, when answering a scenario question such as "How to control the growing problems such as diseases, weeds, or pests in environmentally safe manner?" we need to take into account suitable criteria for selecting control methods and the users' context with respect to each criterion factor. We can then formulate the contextualized information using this systematic approach. These questions drive the development of the ontology. In this way, contextual information and knowledge can be represented in a way that satisfies the user needs.

E. Validation and Evaluation Process
It is very important to check the validity of the ontology. In this study, both the correctness of the contents and the correctness of the construction of the ontology have been validated.
Content correctness depends on the definitions of concepts, the relationships between concepts, the hierarchical structures, the concept properties, and the information constraints of the ontology. The Delphi method is a research technique used to obtain responses to a problem from a group of domain experts [20]. We selected the Delphi method to obtain expert advice and responses for checking the definitions of concepts, relationships, data properties, and hierarchical structures. The validation was done mainly by agricultural experts from different agricultural institutes, using questionnaires based on the Delphi method. They verified the correctness, relevancy, and consistency of the ontology components against a set of predefined criteria. The modified Delphi method can be adapted for use in face-to-face group meetings, allowing group discussion [21]. Since we needed more dialogue and collaboration among the participants in the Delphi group, we arranged a discussion based on the modified Delphi method. For this discussion, eleven (11) Agricultural Instructors (AIs) gathered at Lunama Govi Jana Seva Center, Ambalanthota. The main aim of the discussion was to check the criteria relevant to fertilizer application, growing problems, control methods, and so on. The Delphi investigator (one of the authors of this paper) explained the problems in detail to elicit the experts' knowledge, and also allowed the experts to discuss the problems and possible solutions. Based on their responses, comments, and suggestions, we assessed the design criteria and the assumptions we made during the design process. The contents of the ontology have been refined based on the domain experts' feedback and comments.
One approach to checking the correctness of the construction is to analyze whether the ontology contains anomalies or pitfalls [22]. We first identified the common pitfalls before implementation. Next, we identified the types of Ontology Design Patterns (ODPs) that help to avoid these pitfalls by adapting or combining existing ODPs [22]. Design patterns are shared guidelines that help to solve design problems, for example those of Semantic Web Best Practices and Development under the W3C [23]. We also used the web-based tool OOPS! [22] to detect potential pitfalls in the ontology. Using these methods, we validated the ontology in terms of accuracy and quality.
The ontology implemented in Protégé is used to evaluate the ontological commitments internally and to test consistency and inferences using reasoners. We used the CQs, expressed as Description Logic (DL) queries and SPARQL queries, to evaluate the ontological commitments and check whether the ontology meets the users' requirements [11].
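To make the idea of expressing a CQ as a SPARQL query concrete, the sketch below builds one such query as a string and encodes it as a GET request in the style of the SPARQL protocol. Nothing is sent over the network. Only the endpoint URL comes from this paper; the sln: namespace and the property names hasCrop, hasGrowingProblem, and hasControlMethod are placeholders, since the actual IRIs of the ontology are not given here, and this is not the query used in the evaluation.

```python
from urllib.parse import urlencode

# Endpoint quoted in the paper; whether it is still reachable is not assumed.
ENDPOINT = "http://webe2.scem.uws.edu.au/arc2/select.php"

# Hypothetical namespace and term names: the real ontology's IRIs may differ.
PREFIX = "PREFIX sln: <http://example.org/sln#>"


def control_methods_query(crop, problem):
    """Build a SELECT query for one competency question."""
    return f"""{PREFIX}
SELECT ?method WHERE {{
  ?event sln:hasCrop sln:{crop} ;
         sln:hasGrowingProblem sln:{problem} ;
         sln:hasControlMethod ?method .
}}"""


def request_url(query):
    """Encode the query as a GET request URL (parameter name 'query')."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = control_methods_query("Brinjal", "BacterialWilt")
url = request_url(query)
```

A CQ is considered answerable when a query of this kind returns the expected bindings from the knowledge base.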
Next, we checked user satisfaction with the information and knowledge in the ontology. A mobile-based application was developed to provide information using this ontology [24], and we used it for this evaluation. The first evaluation covered crop selection only and was carried out with a group of 32 farmers in Sri Lanka [24]. We have gathered suggestions from farmers and other stakeholders in the domain for our future designs.
The knowledge base was created by populating the ontology with instances, so that agricultural information can be shared and reused via the Web [11]. The online knowledge base can also be used in the evaluation process. We can query the contextualized information on the Web via this application (a SPARQL endpoint) using SPARQL queries (see http://webe2.scem.uws.edu.au/arc2/select.php). This application is especially useful for agricultural instructors, researchers, and people at the Department of Agriculture to find information based on their needs. For example, a SPARQL query can list the suitable environmentally safe control methods to control Bacterial wilt disease for the Brinjal crop. We evaluated the knowledge represented in the ontology by evaluating the outputs of such queries. The output of this query is shown in Fig. 6.

Fig. 6 The output of the above query

III. ONTOLOGY MANAGEMENT SYSTEM (OMS)
If a developed ontology is not up to date, or the annotation of knowledge resources is inconsistent, redundant, or incomplete, then the reliability, accuracy, and effectiveness of ontology-based systems decrease significantly [25]. Ontology building is a significant challenge for a number of reasons: it takes a considerable amount of time and effort to construct an ontology, it requires a sophisticated understanding of the subject domain, and it is an even greater challenge if the ontology developer or engineer is not familiar with the domain of interest. As the volume of information increases, capturing the information, maintaining it, and making it usable also become a challenge. It is therefore very important to be able to maintain a developed ontology in practice by updating its content in a timely manner, for example by extending the ontological structure to improve coverage and by modifying the instances (individuals) in the knowledge base.
After developing the ontology, we had to devise a method to maintain it. A community-based facility to manage the structure of the ontology in the long term, as well as to further populate the knowledge base, is very useful. For this we have developed an end-to-end, semi-automatic, collaborative ontology management system for large-scale development and maintenance, with facilities to reuse, modify, extend, and prune the ontology components as required. It also has facilities to capture users' information needs in their context and to search domain information in user context. We use a web-based application to deploy the proposed framework. With the help of this web-based ontology management system, people with little knowledge of ontologies can help to modify the ontology and use the ontological information and knowledge for their needs. Fig. 7 shows the proposed framework for an end-to-end ontology management system. The full details of the design of the framework and the development of the end-to-end ontology management system based on it are explained in [12]. In this paper we briefly present the processes belonging to this framework. The framework has four main processes for community-based ontology development and maintenance: Populate the ontology, Modify the ontology, Search domain information in context, and Capture users' information needs and related users' context. The framework provides the essential facilities to manage the ontology life cycle by supporting these processes. Each process is briefly described below.

A. Populate the Ontology
Using this process, we can obtain support from the agriculture community to fully populate the knowledge base in the long term. To populate it, we rely in particular on people in the domain, for example domain experts such as agricultural instructors, information specialists, and researchers in the agriculture community. To fully populate the ontology with real data, we developed a semi-automated system that captures this information using a web-based application. For this we used a framework called CBEADs (Component Based Ebusiness Application Development and Deployment Shell) [26] as the data capturing application. This framework, which is built with PHP and MySQL, has the potential to evolve with changing requirements. More details related to each process can be found in [12]. The form shown in Fig. 8 is used to gather the required data through the CBEADs application.
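The step from a submitted form to knowledge base instances can be pictured as a small validation and conversion routine. The sketch below is written in Python purely for illustration (the actual system is built on CBEADs with PHP and MySQL); the form field names, the predicate isVarietyOf, and the sample values are assumptions and do not reflect the real form in Fig. 8.

```python
REQUIRED_FIELDS = ("variety_name", "crop")   # assumed form fields


def form_to_triples(form):
    """Validate a submitted form and turn it into
    (subject, predicate, object) triples."""
    missing = [f for f in REQUIRED_FIELDS if not form.get(f)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Derive an instance identifier from the variety name.
    subject = form["variety_name"].strip().replace(" ", "_")
    triples = [(subject, "rdf:type", "Variety"),
               (subject, "isVarietyOf", form["crop"])]   # hypothetical predicate
    # Every other non-empty field becomes a property value of the instance.
    for key, value in sorted(form.items()):
        if key not in REQUIRED_FIELDS and value not in ("", None):
            triples.append((subject, key, value))
    return triples
```

Rejecting incomplete submissions before conversion keeps partially filled forms from producing dangling instances in the knowledge base.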

B. Modify the Ontology
This process helps us extend and prune the ontology as user requirements and related user contexts change or expand. It can be performed by agriculture domain experts and ontology developers. Since modifying the structure of the ontology is complex, we need to manage this process carefully. It has three sub-processes: insertion, deletion, and updating (change). Each sub-process has three main activities. For example, insertion covers Inserting Concepts, Inserting Data properties, and Inserting Object properties; deletion and updating have the corresponding activities. In this model, seven steps are proposed to modify the ontology: (1) view the ontology structure (initial structure) represented in the Cmap tool; (2) extract domain terms, concepts, and basic hierarchies using the Text-To-ONTO tool; (3) view the ontology design framework used to represent the information in user context; (4) based on the design framework, modify the structure using metadata (the metadata tells users how to modify the structure of the ontology, for example how to insert concepts or how to delete data properties or object properties); (5) validate the modified content using web forms; (6) convert the modified content into RDF or OWL format; and (7) import the modified content into the initial ontology for information integration. The way the ontology is modified in this application is outside the scope of this paper and is explained in [12].
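The three sub-processes applied to the three kinds of component give nine basic operations. A toy in-memory model, sketched below, shows that structure; the class OntologyStructure and its methods are illustrative names and are not the implementation described in [12].

```python
class OntologyStructure:
    """Toy container for the three component kinds named in the text."""

    KINDS = ("concepts", "data_properties", "object_properties")

    def __init__(self):
        self.components = {kind: set() for kind in self.KINDS}

    def insert(self, kind, name):
        if name in self.components[kind]:
            raise ValueError(f"{name} already exists in {kind}")
        self.components[kind].add(name)

    def delete(self, kind, name):
        if name not in self.components[kind]:
            raise KeyError(f"{name} not found in {kind}")
        self.components[kind].remove(name)

    def update(self, kind, old, new):
        """Rename a component; both checks run before anything changes."""
        if old not in self.components[kind]:
            raise KeyError(f"{old} not found in {kind}")
        if new in self.components[kind]:
            raise ValueError(f"{new} already exists in {kind}")
        self.components[kind].remove(old)
        self.components[kind].add(new)
```

A real system would also have to handle the effects of each change on dependent axioms and instances, which is part of why the process needs careful management.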

C. Search Domain Information in Context
So that all stakeholders in the community can benefit from the knowledge base by finding the right information for their context, we have included the process "search domain information in context". Through this system we provide two facilities. Normal users such as farmers can view the domain information in their context, and other stakeholders in the domain, especially agricultural instructors and researchers, can retrieve domain information and knowledge based on their interests. For farmers, we provide specific answers to their questions in their context in natural language (English, Sinhala, and Tamil). Fig. 9 shows a user-friendly interface for searching information.

D. Capture User Information Needs in Context
The process "capture the user information needs and related user context" collects the information required to extend the ontology further. Since delivering benefits to a broad audience is an even more challenging task, this collaborative end-to-end ontology management system, accessed through a web-based interface, has now been expanded to include their requirements in context. We can then extend our ontology with different motivation scenarios, providing an even richer knowledge environment to support the agriculture community.
Since this is a collaborative approach, the system relies mostly on the users of the domain, their participation in the system, and the skills of developers and administrators in overseeing the collaborative processes. In our system (see http://webe2.scem.uws.edu.au/oms/index.php), there are three main user categories (domain experts, normal users, and ontology developers) with different access rights. Fig. 10 shows the home page of the OMS (in English). Domain experts and ontology developers need to be logged in to the system to populate and modify the ontology. They can change or extend the ontology using the requirements and user constraints gathered by the system. There are processes to capture user information needs and the related user context from users, in order to represent domain information in context. Domain experts are also involved in populating the ontology by entering instance values through the forms. Through this system, all stakeholders in the community can search for information, either through user-friendly interfaces (for normal users such as farmers) or by querying the SPARQL endpoint in context (for advanced users). We developed this web-based application in English, but English is not the official language of Sri Lanka. Sri Lankan people mainly use their native languages, such as Sinhala and Tamil. We therefore provide the facility to use this application in these languages.
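The access rights described above can be summarized as a small permission table. The sketch below is one reading of the prose, not the access control code of the OMS: the action names and the exact assignment of actions to roles are assumptions.

```python
from enum import Enum


class Role(Enum):
    DOMAIN_EXPERT = "domain expert"
    NORMAL_USER = "normal user"
    ONTOLOGY_DEVELOPER = "ontology developer"


# Assumed permission table, read off the prose description.
PERMISSIONS = {
    Role.NORMAL_USER: {"search", "submit_information_need"},
    Role.DOMAIN_EXPERT: {"search", "submit_information_need",
                         "populate", "modify"},
    Role.ONTOLOGY_DEVELOPER: {"search", "submit_information_need",
                              "populate", "modify"},
}

# Actions that the text says require a logged-in session.
LOGIN_REQUIRED = {"populate", "modify"}


def is_allowed(role, action, logged_in=False):
    """True if the role may perform the action in the current session."""
    if action not in PERMISSIONS[role]:
        return False
    return logged_in or action not in LOGIN_REQUIRED
```

For example, is_allowed(Role.NORMAL_USER, "search") holds without logging in, while a domain expert must be logged in before populating the ontology.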
Fig. 11 shows the overall development process of a community-based ontology, summarizing the two sections above (II and III). This is an iterative process. Based on the results and feedback of the validation and evaluation processes, the design of the ontology is refined using the design framework shown in Fig. 5. The ontology can then be maintained using the web-based ontology management system based on the framework shown in Fig. 7.

IV. CONCLUSIONS
Agriculture is the most important sector in the Sri Lankan economy. People in the agriculture domain in Sri Lanka need agricultural information and relevant knowledge to make optimal decisions for successful farming and to do research for the development of the agriculture sector and the enhancement of the farming industry. Since the lack of agricultural knowledge repositories that people in the agriculture community can easily access within their own context is a major problem, a user-centric knowledge environment has been developed as a solution.
Through this study, we first identified the user context model related to farmers in Sri Lanka. Next, we developed a logic-based ontological approach to meet the information needs in a way that suits the identified context. We achieved this by modifying how contextualized information is formulated in a well-established methodology.
This article presents a summary of the overall ontology development process, which organizes domain knowledge to meet particular access requirements effectively, following the guidelines shown in Fig. 11. We validated the ontology in terms of accuracy and quality using the Delphi and modified Delphi methods, a web-based tool, and ODPs. We evaluated the ontology against the user requirements using mobile-based and web-based applications. An online knowledge base with a SPARQL endpoint was created to share and reuse the domain knowledge. To fully populate the knowledge base and to modify the ontology by extending its coverage of the domain, we developed a semi-automatic end-to-end ontology management system that helps us develop and manage complex, real-world, application-based ontologies in the long term as a collaborative process. Managing the ontology through this OMS is therefore a community activity.
We received very valuable feedback from the domain experts during the group discussions in the modified Delphi method, as well as from the field trials. Based on this feedback, we are now refining our application.

Fig. 1 Generic Crop Knowledge Module

Fig. 3 Representation of different constraints

Fig. 7 A Framework for End-to-End Ontology Management System

Fig. 8 Data gathering Interface for crop variety

Fig. 9 Interface for searching information in context

Fig. 11 Overall Development Process of the Ontology

TABLE II SUMMARY OF THE CRITERION FACTORS FOR CROP SELECTION, FERTILIZER APPLICATION AND CONTROL METHODS