An Assessment of Computer Awareness and Literacy among Entry-Level University of Colombo Undergraduates: A Case Study

Abstract: As the demand for computer-literate people is increasing at a rapid pace, computer skills are an important asset for a university student, and good computer knowledge improves the quality of their study programs. This paper discusses a case in which information was collected through a survey to assess the computer knowledge of entering freshmen in five Faculties (Science, Arts, Management & Finance, Law and Medicine) of the University of Colombo, Sri Lanka. The survey was conducted among 300 new entrants of these Faculties. A descriptive analysis was used to identify the patterns of computer usage. It was found that the majority of entry-level University of Colombo undergraduates have used a computer (93%) and/or the Internet (60%). Moreover, 60% are computer aware while only 47% are computer literate. Males in general outperformed females in computer awareness, computer literacy and Internet usage. Since a Chi-Square test confirmed that the two variables, computer awareness and computer literacy, are associated with each other, considering them jointly rather than separately was expected to yield better results. Hence, the two variables were combined into one variable with four levels, and a generalized logit model was fitted to this nominal multi-category variable to find the factors affecting computer awareness and/or computer literacy. The model suggests that the new variable depends on the following factors: usage of the Internet, monthly family income level, methods of obtaining computer knowledge, and locations of using a computer.


I. INTRODUCTION
Technology plays an important role in accelerating economic growth and promoting development. Perhaps no other single technological innovation during the second half of the 20th century has touched so many lives as the computer [1]. With the increased use of information and communication technologies in education, students entering university need basic computer skills. As students come from different socio-economic backgrounds, they have different learning experiences, capabilities, and needs. It is rarely the case that the computer skills of university freshmen are at the same level.
Computer literacy is a mixture of awareness (e.g., awareness of the computer's importance), knowledge (e.g., knowledge of what computers are and how they work), and interaction (the ability to interact with computers). This view is embraced by [2], [3], and [4]. The perspective of computer literacy in [5] involves conceptual knowledge related to basic terminology (including social, ethical, legal, and global issues) and the skills necessary to perform tasks in word processing, databases, spreadsheets, presentations, graphics, and basic operating system functions.
A literature search for studies that model the factors affecting computer awareness and computer literacy with suitable statistical models did not produce favorable results. However, the review revealed a small number of studies of a similar nature. One study was conducted by Temple University [6] with 259 entry-level students to determine their computer literacy at the beginning of the 1997-98 academic year. The study used a questionnaire and revealed the following: at least 60% of the entering students had access to computers; approximately 60% of the students had used email, online information services, or the World Wide Web; and students used word processing most frequently and database management systems software least often. In [7], results from a computer concepts assessment given to students enrolled in a computer literacy course at Midwestern Regional University were reported. Slightly over 75% of these students scored above the minimum acceptable college entrance score on word processing and 63.55% on presentation skills, but only 40% on databases. Only 6.13% of these students had college entrance scores that exceeded the minimum considered acceptable for all components of the test, which also included a wide range of additional topics (networking and the Internet, social and ethical issues, presentation, graphics, operating systems/hardware, word processing, database, and spreadsheets). Besides, all students had a vague idea of selected computer terms, with some variation by discipline.
Another study [8] used a written questionnaire for incoming medical students at the School of Medicine, Virginia Commonwealth University, from 1991 to 2000. The survey's purpose was to learn the students' levels of knowledge, skill, and experience with computer technology in order to guide instructional services and facilities. The questionnaire was administered during incoming medical students' orientation or mailed to students' homes after matriculation. The average survey response rate was 81% from an average of 177 students. Six major changes were introduced based on information collected from the surveys and advances in technology: distribution of CD-ROMs containing required computer-based instructional programs, delivery of evaluation instruments via the Internet, modification of the lab to a PC-based environment, development of an electronic curriculum website, development of computerized examinations to prepare students for the computerized national board examinations, and initiation of a Personal Digital Assistant (PDA) project.
This paper is based on a survey to assess the computer knowledge of entering freshmen in five Faculties of the University of Colombo, in order to determine whether incoming students possess the basic computer knowledge and skills to begin their studies effectively. The survey also identifies the factors affecting new entrants' computer awareness and computer literacy. Thus, this study describes different aspects of freshmen's computer usage but does not cover the knowledge of undergraduates in their second to fourth years. The survey was carried out within the freshmen's first three months at the university in the year 2009, and a sample of 300 was selected using stratified random sampling. The respondents were interviewed face to face using a questionnaire containing single-choice, multiple-choice, 1-3 ranking (1-highest to 3-lowest), and 5-point Likert-scale (1-strongly disagree to 5-strongly agree) questions. Initially a descriptive analysis was done, followed by an advanced analysis that resulted in fitting a generalized logit model.

A. Sampling Procedure
The sampling procedure for our study has two steps. First, using stratified random sampling [9][10], the strata were identified; second, using the proportional allocation method [9][10], the sample size for each stratum was calculated. In this study, the five Faculties were considered as five strata, and each was then divided by gender, giving ten strata. The total sample size of 300 was divided among the ten strata such that each stratum's sample size is proportional to its population size. A benefit of stratified sampling is that estimates will be more precise, since each stratum is more homogeneous (less variable) than the population as a whole. Table I displays the sample size allocated for each stratum.

TABLE I SAMPLE SIZE FOR EACH STRATUM USING PROPORTIONAL ALLOCATION METHOD

In this sampling method, simple random sampling [9][10] was performed independently within each stratum. According to Table I, for example, a simple random sample of 39 male students had to be taken from the Science Faculty for the data collection. Likewise, ten simple random samples of different sizes were required. The sizes differ because, under the proportional allocation method, each sample size is proportional to the population size of its stratum. If the strata were about the same size, it would be more convenient to take the same sample size in each stratum.
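The proportional allocation step can be sketched in a few lines of Python. This is only an illustrative sketch: the stratum population sizes below are hypothetical placeholders (the actual enrolment figures are not reproduced here), and the largest-remainder rounding rule is an assumption made so that the stratum sample sizes sum exactly to 300.

```python
def proportional_allocation(population_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum population size."""
    total_population = sum(population_sizes.values())
    allocation = {}
    remainders = {}
    for stratum, size in population_sizes.items():
        # Integer arithmetic keeps the floor and the remainder exact.
        allocation[stratum], remainders[stratum] = divmod(
            total_sample * size, total_population
        )
    # Assumed rounding rule: hand the leftover units to the largest remainders.
    leftover = total_sample - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1
    return allocation


# Hypothetical population sizes for the ten (Faculty, gender) strata.
hypothetical_population = {
    ("Science", "Male"): 250, ("Science", "Female"): 250,
    ("Arts", "Male"): 150, ("Arts", "Female"): 450,
    ("Management & Finance", "Male"): 200, ("Management & Finance", "Female"): 250,
    ("Law", "Male"): 60, ("Law", "Female"): 190,
    ("Medicine", "Male"): 100, ("Medicine", "Female"): 100,
}

sample_sizes = proportional_allocation(hypothetical_population, 300)
for stratum, n in sample_sizes.items():
    print(stratum, n)
```

Each stratum then receives its own simple random sample of the computed size.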
In this study, a specific procedure was used to make certain that the samples are random and unbiased. First, the contact details of the male and female students of the five Faculties were obtained. Then the specified number of students from each stratum was selected using a random number table, and only those selected were contacted later for an interview. If a particular student was unwilling to respond, another student was contacted. Interviews were conducted through direct face-to-face interaction to minimize the questions and confusion that might arise while answering the questionnaire. The respondents filled in the questionnaire themselves, while the interviewer helped them with clarifications when necessary. When the questionnaires were collected, it was ensured that all required fields had been answered.

B. Computer Awareness
A few definitions of the term "computer awareness" are available in the literature [1] [11]. In [1], a person who has heard of at least one of the uses of a computer (e.g., from playing games to complicated aeronautic applications) is considered a person with computer awareness.
Another study [11] used five pointers to measure awareness: a short history of computers, a short history of the Internet, ways computers are used in society, occupations related to computer use, and computer ethics. In this study, five indicators were created with the help of the above two studies. Five pointers of the study [11] were …

It is understood that, in measuring the ability to perform several functions, an assessment is the best option. However, it was not feasible to conduct an assessment for each respondent, for several reasons: the field work of the main survey was conducted by only one interviewer, so as to minimize interviewer bias and other errors; the time frame for the survey was limited, and an assessment is a time-consuming tool; and respondents were unwilling to grant more time after completing the questionnaire. Respondents were therefore interviewed through direct face-to-face interaction in order to minimize possible questions and confusion in the course of answering the questionnaire, and thus to reduce the gap between the results of an assessment and those of a questionnaire.

D. Chi-Square Test
The Chi-Square test [14] [15] provides a method for testing the association between two categorical variables in a two-way table. An example of a Chi-Square test for a two-way table is given below, with the objective of studying the association between smoking habit and heart attack.
Since the degrees of freedom = (r-1)(c-1) = (2-1)(2-1) = 1, equation (3) is used. The resulting $X^2$ value is compared with $\chi^2_{1,\,0.05}$, which is taken from Chi-Square tables at the 5% significance level. Since $X^2 = 7.97 > \chi^2_{1,\,0.05} = 3.84$, we reject $H_0$ at the 5% level and conclude that there is an association between smoking habit and heart attack. In practice, statistical software is used to perform a Chi-Square test, and in this study, SPSS ® [19] statistical software was used.
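For a 2 x 2 table, the test with one degree of freedom can be sketched as follows. The sketch assumes the standard form of the Yates continuity correction, in which each absolute deviation |O - E| is reduced by 0.5 (and not allowed to go below zero) before squaring. The observed counts are hypothetical and are not the smoking and heart attack data of the example; the function name is likewise illustrative.

```python
def yates_chi_square(table):
    """Chi-Square statistic with Yates continuity correction for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (no association).
            expected = row_totals[i] * col_totals[j] / grand_total
            # Yates correction: shrink |O - E| by 0.5, never below zero.
            deviation = max(abs(observed - expected) - 0.5, 0.0)
            statistic += deviation ** 2 / expected
    return statistic


CRITICAL_VALUE_1_DF_5_PERCENT = 3.84  # tabulated Chi-Square value, 1 d.f., 5% level

hypothetical_counts = [[30, 10],
                       [10, 30]]  # hypothetical 2 x 2 table of observed counts
statistic = yates_chi_square(hypothetical_counts)
reject_h0 = statistic > CRITICAL_VALUE_1_DF_5_PERCENT
print(round(statistic, 2), reject_h0)
```

If the statistic exceeds 3.84, the null hypothesis of no association is rejected at the 5% level.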

E. Generalized Logit Model
The generalized logit model [20] [21] [22] is used when the response variable is nominal and multi-category [16] (two or more categories that do not have an intrinsic order). In this study, the Chi-Square test showed that the variables computer awareness and computer literacy are associated with each other. Hence, it would not be appropriate to consider them separately. By joining these two variables, a new variable with four categories was constructed such that these categories are not ordered or ranked. A generalized logit model was then fitted to this new variable with the objective of finding the factors affecting it.
Suppose there is a nominal [16] variable with I categories.
In fitting the generalized logit model, one of the categories must be taken as the "baseline" so that the other categories can be compared with it. In usual practice, the last category is taken as the baseline, as the comparison will be more meaningful. Thus, when the last category (I) is the baseline, the generalized logit model with a factor x is

$$\log\left(\frac{p_i(x)}{p_I(x)}\right) = \alpha_i + \beta_i x, \quad i = 1, \dots, I-1, \qquad (4)$$

where $p_i(x)$ is the probability of occurrence of the response of interest (conditional probability) for category i given factor x; $\alpha_i$ is the intercept of the i th logit; and $\beta_i$ is the parameter estimate for factor x in the i th logit. Model (4) indicates that the factor x affects the nominal variable.
After fitting a generalized logit model, the next step is to compute the conditional probabilities $\{p_i(x)\}$ so that conclusions can be drawn by examining them. A model with I categories has (I-1) logits. Hence, model (4) consists of the (I-1) logits $\{L_1(x), L_2(x), \dots, L_{I-1}(x)\}$, where $L_i(x) = \log(p_i(x)/p_I(x))$, and the values of these logits can be calculated using the parameter estimates. Finally, the conditional probabilities can be computed since they satisfy the equation $\sum_{i=1}^{I} p_i(x) = 1$.
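The step from logits to conditional probabilities can be sketched as follows. With the last category as the baseline, exponentiating each logit gives the odds of that category relative to the baseline, and the sum-to-one constraint then fixes the baseline probability at 1 / (1 + sum of the odds). The logit values used here are hypothetical, and the function name is illustrative.

```python
import math


def probabilities_from_logits(logits):
    """Convert (I-1) baseline-category logits into I conditional probabilities.

    The last returned value is the probability of the baseline category.
    """
    odds = [math.exp(value) for value in logits]  # p_i / p_I for each non-baseline i
    baseline = 1.0 / (1.0 + sum(odds))            # follows from sum of p_i = 1
    return [o * baseline for o in odds] + [baseline]


hypothetical_logits = [1.2, 0.3, -0.8]  # hypothetical values for a 4-category response
probs = probabilities_from_logits(hypothetical_logits)
print([round(p, 3) for p in probs])
```

A positive logit means the category is more probable than the baseline, and a negative logit means it is less probable.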

F. Analysis Of Single-Choice And Multiple-Choice Questions
In a single-choice question, only one response is selected. In a multiple-choice question, several responses may be selected. These responses are usually marked with a tick (✓). An example of a multiple-choice question is as follows.
Example: Which locations have you used to make use of computers when you enter the university?

G. Analysis Of Ranked Responses
Instead of simply choosing the responses using a tick, as in a single-choice or multiple-choice question, in this type of question the responses are ranked. In this study, rankings are on a 1-3 scale, with "1" for the highest rank and "3" for the lowest rank. An example is given below.
Example: What are the three mostly used software packages when you enter the university? (Please rank the three most factors 1-highest … 3-lowest)
a. Ms Office packages
b. Database Management
c. Computer Graphics
d. Web Designing
e. Other (specify) ……
As the first step of the analysis, the frequencies of the three ranks are counted and then multiplied by the weights 0.5, 0.3 and 0.2, such that the highest rank (rank 1) is multiplied by the highest weight (0.5), and so on. In practice, these weights are chosen such that they sum to one. Then the total score of each factor is calculated. Finally, the percentage for each factor is obtained as its share of the total score of all factors.
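The weighted scoring of ranked responses can be sketched as follows. The weights 0.5, 0.3 and 0.2 are those stated above; the rank frequencies are hypothetical and only illustrate the arithmetic, and the function name is illustrative.

```python
RANK_WEIGHTS = {1: 0.5, 2: 0.3, 3: 0.2}  # weights for ranks 1-3; they sum to one


def weighted_rank_percentages(rank_frequencies):
    """Turn per-option rank frequencies into percentage shares of the total score."""
    scores = {
        option: sum(RANK_WEIGHTS[rank] * count for rank, count in counts.items())
        for option, counts in rank_frequencies.items()
    }
    total_score = sum(scores.values())
    return {option: 100.0 * score / total_score for option, score in scores.items()}


# Hypothetical frequencies: how many respondents gave each option rank 1, 2 or 3.
hypothetical_frequencies = {
    "Ms Office packages": {1: 120, 2: 40, 3: 20},
    "Database Management": {1: 10, 2: 50, 3: 60},
    "Computer Graphics": {1: 30, 2: 60, 3: 50},
    "Web Designing": {1: 20, 2: 30, 3: 50},
}

percentages = weighted_rank_percentages(hypothetical_frequencies)
for option, share in percentages.items():
    print(f"{option}: {share:.1f}%")
```

Because the percentages are shares of the total score, they sum to 100 across the options.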

III. RESULTS
A. Descriptive Analysis
1) Computer Usage
The majority of students (93%) had used computers when they entered the university. From these respondents, the survey sought to determine the reasons for computer usage. Since frequency of using a computer is a single-choice question and locations of using a computer is a multiple-choice question, they were analyzed according to Section II F. Since the other three factors (purposes of using a computer, software packages used, and methods of obtaining computer knowledge) use ranked responses, they were analyzed according to Section II G. The results are given in Table V. From the few respondents who had not used computers, the reasons for not using a computer were obtained. This is a question with ranked responses and was analyzed according to Section II G. According to the results, the largest share of these respondents (35%) indicated that the main reason for not using a computer is not having a computer at home, while for 32% of the respondents the main reason is financial constraints. The results of the analysis are given in Table VI.

TABLE VI RESULTS OF THE COMPUTER NON-USAGE

2) Computer Awareness and Computer Literacy
According to the definitions of computer awareness and computer literacy used in this study, an attempt was made to find the percentages of freshmen having computer awareness and computer literacy. As stated in Section II B and Section II C, a person was considered computer aware if he/she possesses all five indicators of computer awareness, and computer literate if he/she possesses all six indicators of computer literacy.

International Journal on Advances in ICT for Emerging Regions 04
June 2011

B. Testing The Association Between Computer Awareness And Computer Literacy
In order to fit suitable models for the two variables computer awareness and computer literacy, it was first necessary to find out whether they are associated. The Chi-Square test was used for this purpose. The hypotheses of the test are H 0 : Computer awareness and computer literacy are not associated vs H 1 : Computer awareness and computer literacy are associated. Since there is one degree of freedom ((r-1)(c-1) = (2-1)(2-1) = 1), the Yates continuity correction was used (Section II D). The results of the Chi-Square test obtained from SPSS ® [19] statistical software are as follows.

TABLE XI RESULTS OF THE CHI-SQUARE TEST
A Chi-Square value of 20.676 with 1 degree of freedom (p < 0.001, reported by SPSS as .000) shows that the test is significant at the 5% level (H 0 is rejected), and there is significant evidence that the two variables are associated with each other.

C. Fitting A Generalized Logit Model
The above result (Table XI) suggests that computer awareness and computer literacy have to be considered jointly rather than separately. In order to consider them jointly, a new variable with four categories was created as follows:
Category 1: having both computer awareness and computer literacy
Category 2: having only computer awareness
Category 3: having only computer literacy
Category 4: having neither computer awareness nor computer literacy
Since the four categories of this new variable are not ordered, a generalized logit model was suitable (see Section II E). Moreover, the last category (category 4) was taken as the baseline, and SAS ® [23] statistical software was used to carry out the model selection.
The forward selection procedure [24] was used to find the best model. This procedure starts with the null model (intercept term only); the factors that may contribute to the new variable are added one at a time, and the factor that gives the minimum p value is selected. The candidate factors were identified from the questionnaire (see Appendix): Gender, District, Family member who does IT related job, Monthly family income, Usage of resources, Purposes of using a computer, Frequency of using a computer, Locations of using a computer, Software packages used, Methods of obtaining computer knowledge, Usage of Internet, Uses of Internet, and Frequency of using Internet. Then, each of the remaining factors was added to the selected model (containing the most significant factor) and the next most significant factor was selected. This process continues until none of the remaining factors is significant. The final best model was

$$\log\left(\frac{p_i}{p_4}\right) = \alpha_i + \beta^{uin}_{ij} + \beta^{inc}_{ik} + \beta^{met}_{il} + \beta^{loc}_{im} \qquad (5)$$

where $p_i$ is the probability of category i of the new variable; $\alpha_i$ is the intercept of the i th logit; uin = usage of Internet, inc = monthly family income level, met = methods of obtaining computer knowledge, and loc = locations of using a computer; i = 1, 2, 3; j = 1 (Internet user), 2 (non-Internet user); k = 1 ( < Rs.15000), 2 (Rs.15000 - Rs.30000), 3 (Rs.30000 - Rs.50000), and 4 ( > Rs.50000); l = 1 (computer courses followed), 2 (school), 3 (self study, family members, another person, other); and m = 1 (one location), 2 (two locations), 3 (three locations), and 4 (more than three locations).
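The forward selection loop described above can be sketched as follows. This is a schematic sketch, not the SAS procedure used in the study: the function `p_value_when_added` is a hypothetical callback that, in a real analysis, would refit the generalized logit model with the candidate factor added and return its p value. The 5% threshold, the generic factor names and the fixed p values in the toy callback are likewise assumptions made purely for illustration.

```python
def forward_selection(candidates, p_value_when_added, alpha=0.05):
    """Add the most significant remaining factor until none is significant."""
    selected = []                  # start from the null (intercept-only) model
    remaining = list(candidates)
    while remaining:
        # p value of each remaining factor when added to the current model
        p_values = {f: p_value_when_added(selected, f) for f in remaining}
        best = min(p_values, key=p_values.get)
        if p_values[best] >= alpha:
            break                  # no remaining factor is significant
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical p values; a real callback would depend on the factors already selected.
HYPOTHETICAL_P_VALUES = {
    "factor_a": 0.001,
    "factor_b": 0.30,
    "factor_c": 0.02,
    "factor_d": 0.65,
}


def toy_p_value(selected, factor):
    """Toy stand-in for a model refit: looks up a fixed, hypothetical p value."""
    return HYPOTHETICAL_P_VALUES[factor]


chosen = forward_selection(HYPOTHETICAL_P_VALUES, toy_p_value)
print(chosen)
```

In practice the p value of a factor changes as other factors enter the model, which is why the callback receives the list of already selected factors.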
Purposes of using Internet: Education & learning activities (37), For getting information (25), Leisure activities (22), Communication (14), Office work (1), Self employment (1)
Frequency of using Internet: Several times a week (30), Rarely (26), Once a week (21), Daily (14), Once a month (9), Other (…)

As there are four levels for the new variable, model (5) consists of three logits (see Section II E). In model (5), $\log(p_i/p_4)$, i = 1, 2, 3, are known as logits. Parameter estimates of these logits are provided in Table XII. After fitting a model, the usual practice is to test the adequacy of the model. This aspect of the adequacy of a model is referred to as goodness of fit [24].

Conditional Probabilities
After parameter estimation, conditional probabilities for each generalized logit model were calculated, and the following results were obtained.
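Model 1: When i = 1, model (5) models the probability of category 1 of the new variable (having both computer awareness and computer literacy) relative to category 4 of the new variable (not having both computer awareness and computer literacy).
Model 2: When i = 2, model (5) models the probability of category 2 of the new variable (having only computer awareness) relative to category 4.
Model 3: When i = 3, model (5) models the probability of category 3 of the new variable (having only computer literacy) relative to category 4.
Then a total of 278 sets of conditional probabilities $\{p_1, p_2, p_3, p_4\}$ have to be calculated, one for each of the 278 respondents who had used a computer when they entered the university. In order to do this for different j, k, l, and m values, the parameter estimates of the twelve terms from Table XII have to be substituted in the above three models. Subsequently, the conditional probabilities $\{p_1, p_2, p_3, p_4\}$ are found using the constraint $\sum_{i=1}^{4} p_i = 1$.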
Using the conditional probabilities, conclusions about the respondents in each category of the new variable can be derived. In order to describe category 1, the records that have p 1 as the highest conditional probability are taken from the total of 278 records. These highest p 1 conditional probabilities are then arranged in descending order. Some of them are listed in Table XIII. Altogether, 163 records were found with p 1 as the highest conditional probability. These 163 records were examined to understand the type of respondents in category 1, that is, those with a higher chance of having both computer awareness and computer literacy. It is clear that respondents who are Internet users, have a higher monthly family income level (Rs.30,000 or greater), and use three or more locations for computing are more likely to be both computer aware and computer literate. Further, they are likely to obtain computer knowledge from several sources, such as computer courses, self study, family members, another person and school.
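Grouping the records by their highest conditional probability, and sorting each group in descending order, can be sketched as follows. The record identifiers and probability values are hypothetical, and the function name is illustrative; a category for which no record attains the highest probability simply does not appear in the result.

```python
def group_by_highest_probability(records):
    """Assign each record to the category with its highest conditional probability.

    records maps a record id to [p1, p2, p3, p4]. Returns a dict from category
    number (1-based) to a list of (record id, highest probability), sorted in
    descending order of that probability.
    """
    groups = {}
    for record_id, probs in records.items():
        best_index = max(range(len(probs)), key=lambda i: probs[i])
        groups.setdefault(best_index + 1, []).append((record_id, probs[best_index]))
    for members in groups.values():
        members.sort(key=lambda item: item[1], reverse=True)
    return groups


# Hypothetical conditional probabilities {p1, p2, p3, p4} for four records.
hypothetical_records = {
    "r1": [0.70, 0.20, 0.05, 0.05],
    "r2": [0.10, 0.60, 0.10, 0.20],
    "r3": [0.90, 0.05, 0.03, 0.02],
    "r4": [0.20, 0.20, 0.10, 0.50],
}

groups = group_by_highest_probability(hypothetical_records)
for category in sorted(groups):
    print(category, groups[category])
```

The size of each group corresponds to the record counts reported for each category.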
There were 49 records in which the highest conditional probability is p 2 (category 2, having only computer awareness). From these records, it can be said that respondents who are mostly non-Internet users, have a monthly family income level of less than Rs.30,000, and use few (fewer than three) locations for computing are more likely to be in the computer awareness only category. Additionally, these students obtain their computer knowledge from school and/or by following computer courses.
For category 4 (not having both computer awareness and computer literacy), 66 records were found. From these records, it is evident that respondents who are mostly non-Internet users, have a monthly family income level of less than Rs.15,000, and use only one location for computing seem to be the ones with the highest probability of not having both computer awareness and computer literacy. Moreover, it appears that their main source of computer knowledge is school.
It is noted from the results that category 3 (having only computer literacy) was not the highest conditional probability for any record. This result reveals that a person is less likely to be only computer literate without being aware of computers.

A Chi-Square test was used to identify whether computer awareness and computer literacy are associated, in order to decide whether to model the effect of the explanatory variables separately on the two binary response variables using two logistic models, or jointly on the two responses using a generalized logit model. The Chi-Square analysis showed that the two response variables are associated with each other; hence a generalized logit model was fitted to the new variable formed by combining the levels of the two binary responses. The generalized logit model for the combined responses suggests that the combined response depends on the following factors: usage of the Internet, monthly family income level, methods of obtaining computer knowledge, and locations of using a computer.
The research findings revealed that University of Colombo freshmen who are likely to be both computer aware and computer literate share several characteristics. These respondents are Internet users, and their monthly family income level is high. Further, they use more locations for computing, and they obtain computer knowledge from several sources such as computer courses, self study, family members, another person and school. In contrast, for the respondents who are likely to be neither computer aware nor computer literate, it is the other way round; i.e., most of them are non-Internet users from families with a low monthly income. In addition, they use only one location for computing and obtain computer knowledge from few sources, such as school and/or computer courses.
The analysis further showed that new entrants of the University of Colombo who are likely to be only computer aware have the following features. They are mostly non-Internet users from families with a medium level of monthly income. Besides, they obtain computer knowledge from few types of sources, such as school and/or computer courses, and use one or two locations for computing. A significant finding was that no highest conditional probabilities were found for respondents who are likely to be only computer literate. Since these values are probabilities, the absence of highest conditional probabilities for this category in this study does not allow one to conclude that a person can never be only computer literate without being aware of computers.
In … concentrate more on improving the computer literacy skill base of students, especially female learners. However, both groups would benefit from further instruction and practical experience in this subject matter. In order to better prepare students for university computer modules, the administrative bodies of the university should consider offering practical computer sessions and teaching students helpful tips and shortcuts for better computer fluency. Further, administrative bodies can compare these results with results from future classes. Another interesting idea would be to repeat the same survey at the conclusion of the course and compare the pre- and post-course results.

Questionnaire
Please tick (✓) or rank the appropriate boxes as required and follow the instructions carefully.
Section 1 - Personal Details