

STATISTICS CORNER 

Year: 2022 | Volume: 34 | Issue: 1 | Page: 76-78

Variables and risk
Smita Narayanan
Additional Professor, Regional Institute of Ophthalmology, Thiruvananthapuram, Kerala, India
Date of Submission: 27-Jan-2022
Date of Decision: 28-Jan-2022
Date of Acceptance: 28-Jan-2022
Date of Web Publication: 21-Apr-2022
Correspondence Address: Dr. Smita Narayanan, Regional Institute of Ophthalmology, Thiruvananthapuram, Kerala, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/kjo.kjo_16_22
How to cite this article: Narayanan S. Variables and risk. Kerala J Ophthalmol 2022;34:76-8
The basic building blocks of biostatistics are called variables. While critically analyzing a study, the first step is to break down the primary outcome measure to find out which variable is being studied. Then, determine whether it is a continuous or categorical variable.
Categorical Variables   
These are the variables that can be counted and hence do not have a unit. For example, the number of corneal abrasions with an infiltrate, stages of diabetic retinopathy, and the number of cases with qualified success after trabeculectomy.
Continuous Variables   
Those variables that can be measured and hence have a unit are continuous variables.
For example, intraocular pressure in mmHg, retinal nerve fiber layer thickness in μm, and visual acuity in logMAR.
Both these types of variables can be described, analyzed, and presented in two important ways:
Descriptive Statistics   
For comparative studies such as randomized controlled trials, cohort studies, or case–control studies, the descriptive statistics show how closely the groups (two or more) resemble each other.
It is important to note that randomization does not yield perfectly matched groups. If, in a study, the groups are perfectly matched, then that raises a red flag. The degree of matching between the groups is given by a P value for the variable.
If there is an observable dissimilarity in the distribution of a given variable between groups, it may indicate selection bias or a break in allocation concealment. If such a variable is present, check whether the discussion section of the study explains it, and decide whether the explanation is satisfactory.
In a multicenter study, check whether one or a few centers show a large imbalance in allocation.
The categorical variables are expressed as percentages or proportions. For ease of statistical analysis, the categorical variables are best described in a dichotomous form, for example, disease present or absent.
Before the continuous variables are described, first check to see if they have a normal distribution or not.
The easiest way is to draw a histogram of the dataset. A normal distribution shows the typical bell-shaped curve.
Statistical software can test formally whether the distribution is normal, using the Kolmogorov–Smirnov test or the Shapiro–Wilk test.
As a rough rule of thumb, if the standard deviation (S.D) is less than half of the mean, the distribution is assumed to be normal.
The continuous variables are described with a measure of central tendency and a measure of spread. If the distribution is normal, then the mean and S.D are used to describe the continuous variable. In a nonnormal distribution, for which nonparametric methods are used, the median with interquartile range (IQR) is the preferred description.
The median is the value in the middle of the distribution, with 50% of values higher than it and 50% of values lower than it. The IQR is the range within which the middle 50% of values lie. Its lower bound is the first quartile (Q1), below which 25% of the values lie, and its upper bound is the third quartile (Q3), above which 25% of the values lie.
Extreme values in a distribution, also called outliers, distort the range and pull the distribution away from the normal shape. Their effect can also be assessed by looking at how close the mean and the median are to each other.
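The descriptive measures above can be tried out on a small dataset. The following is a minimal sketch using Python's standard `statistics` module. The intraocular pressure readings are hypothetical values chosen only for illustration, with one deliberate outlier, and the quartiles use the module's default method, which may differ slightly from other statistical software.

```python
import statistics

# Hypothetical intraocular pressure readings in mmHg (illustrative only);
# the last value is a deliberate outlier.
iop = [12, 14, 15, 15, 16, 17, 18, 19, 21, 38]

mean = statistics.mean(iop)
sd = statistics.stdev(iop)                   # sample standard deviation
median = statistics.median(iop)
q1, _, q3 = statistics.quantiles(iop, n=4)   # quartile cut points (default method)
iqr = q3 - q1

print(f"mean = {mean:.1f} mmHg, SD = {sd:.1f} mmHg")
print(f"median = {median} mmHg, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")

# A mean that sits well above the median suggests the outlier is
# pulling the distribution away from the normal shape.
print(f"mean - median = {mean - median:.1f} mmHg")
```

Here the single outlier raises the mean above the median, which is one reason the median with IQR would be the safer description for such a dataset.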
Analytic Statistics   
Analytic statistics examine the relationships between variables, which can be assessed as:
 Difference between variables
 Correlation between variables
 Association between variables.
Let us look at analytic statistics with respect to a disease or a health-related event. The most important term in this regard is risk.
$$\text{Risk} = \frac{\text{No. of people with the given disease or health-related event}}{\text{Population of people in that place at a given time}}$$
Often, we are interested in comparing the risk between different groups to look for an association between an intervention and an event. We use a 2 × 2 table.
The risk of development of the event in the intervention group is denoted $R_i$.
The risk of development of the event in the comparator group is denoted $R_c$.
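As an illustration of reading risks off a 2 × 2 table, here is a minimal sketch. The cell labels `a`, `b`, `c`, `d` and the layout shown in the comment are a conventional arrangement assumed for this example, and the counts are hypothetical rather than taken from any study.

```python
# Assumed layout of the 2 x 2 table (hypothetical counts):
#
#                  event   no event
#   intervention     a        b
#   comparator       c        d
a, b = 9, 241
c, d = 25, 225


def risk(events: int, non_events: int) -> float:
    """Risk = number with the event / total number in the group."""
    return events / (events + non_events)


risk_intervention = risk(a, b)   # risk in the intervention group
risk_comparator = risk(c, d)     # risk in the comparator group

print(f"Risk (intervention) = {risk_intervention:.3f}")
print(f"Risk (comparator)   = {risk_comparator:.3f}")
```

With these assumed counts, each group has 250 people, so the two risks are directly comparable as proportions.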
The various analytic statistics that can be used with risk are:
Absolute risk difference
It is the difference between the risk in the intervention group and the risk in the comparator group: $\text{ARD} = R_i - R_c$.
Interpretation
If the risk was equal in both groups, then ARD = 0, and hence, there is no association between intervention and event.
If ARD >0, then there is a positive association, and if ARD <0, then there is a negative association between the intervention and event.
For example, if ARD = −0.064, then the risk of the event is 6.4 percentage points lower (negative ARD) with the given intervention than with the comparator.
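The sign-based interpretation can be sketched in a few lines. The group risks below are hypothetical and were chosen only so that the result matches the ARD of −0.064 used in the example. The function names are illustrative.

```python
def absolute_risk_difference(risk_i: float, risk_c: float) -> float:
    """ARD = risk in intervention group - risk in comparator group."""
    return risk_i - risk_c


def describe_association(ard: float) -> str:
    """Translate the sign of the ARD into words."""
    if ard > 0:
        return "positive association"
    if ard < 0:
        return "negative association"
    return "no association"


# Hypothetical risks: 9 events in 250 treated, 25 events in 250 comparators.
ard = absolute_risk_difference(9 / 250, 25 / 250)
print(f"ARD = {ard:.3f} ({describe_association(ard)})")
```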
Number needed to treat
It is the reciprocal of the absolute value of ARD. Hence, $\text{NNT} = 1/|\text{ARD}|$.
Interpretation
Considering our previous example, where ARD = −0.064, the NNT = 1/0.064 = 15.625 (rounded up to 16).
This means that we must treat 16 people with the intervention to prevent the occurrence of the event in one person. Hence, NNT is a more clinically relevant parameter.
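The arithmetic can be checked with a short sketch. It assumes the common convention of taking the absolute value of the ARD and rounding up to the next whole person. The function name is illustrative.

```python
import math


def number_needed_to_treat(ard: float) -> int:
    """NNT = 1 / |ARD|, rounded up to a whole number of people."""
    if ard == 0:
        raise ValueError("NNT is undefined when ARD is 0 (no risk difference)")
    return math.ceil(1 / abs(ard))


nnt = number_needed_to_treat(-0.064)
print(f"NNT = {nnt}")
```

Rounding up rather than to the nearest integer is a conservative choice, since treating a fraction of a person is not meaningful.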
Relative risk
It is also known as the risk ratio, and it is the ratio of the risk in the treatment group to the risk in the no treatment (comparator) group: $\text{RR} = R_i / R_c$.
Interpretation
For example, if relative risk (RR) = 0.40, then we may interpret that the risk of the event occurring in the treatment group is 40% of the risk in the no treatment (comparator) group.
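A minimal sketch of this calculation follows, using hypothetical group risks chosen so that the RR comes out at 0.40. The relative risk reduction (1 − RR) is added here as a related quantity for illustration and is not part of the text above.

```python
def relative_risk(risk_treatment: float, risk_comparator: float) -> float:
    """RR = risk in treatment group / risk in comparator group."""
    if risk_comparator == 0:
        raise ValueError("RR is undefined when the comparator risk is 0")
    return risk_treatment / risk_comparator


# Hypothetical risks: 10 events in 250 treated, 25 events in 250 comparators.
rr = relative_risk(10 / 250, 25 / 250)
relative_risk_reduction = 1 - rr   # illustrative extra quantity

print(f"RR = {rr:.2f}")
print(f"Risk in the treatment group is {rr:.0%} of the comparator risk,")
print(f"i.e. a relative reduction of {relative_risk_reduction:.0%}.")
```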
Odds ratio
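Before we talk of odds ratio (OR), we must know the meaning of odds. The odds of an event are the probability that it occurs divided by the probability that it does not occur.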
The odds of raining on a given day or the odds of a horse winning a derby are common terms used in this regard.
The OR is the ratio of odds of the event occurring in the intervention group to the odds in the comparator group.
Interpretation of odds ratio or relative risk
If OR or RR = 1, then there is no association between intervention and the event.
If OR or RR >1, then there is a positive association.
If OR or RR <1, then there is a negative association. Hence, if OR = 0.40, the odds of developing the event in the intervention group are 40% of the odds in the comparator group, that is, 60% lower.
Although OR and RR may appear similar, they are not always so. When the event is rare, the OR approximates the RR; when the event is very common, OR and RR differ considerably.
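The way OR and RR drift apart as the event becomes more common can be seen in a short sketch. Both 2 × 2 tables below are hypothetical, the cell labels `a`, `b`, `c`, `d` (events and non-events in the intervention group, then in the comparator group) are an assumed convention, and the counts were chosen so that the RR is 0.40 in both cases.

```python
def risk_ratio_and_odds_ratio(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (RR, OR) for a 2 x 2 table.

    a, b: events and non-events in the intervention group
    c, d: events and non-events in the comparator group
    """
    risk_i = a / (a + b)
    risk_c = c / (c + d)
    odds_i = a / b          # odds = events / non-events
    odds_c = c / d
    return risk_i / risk_c, odds_i / odds_c


# Rare event: about 1% risk in the comparator group.
rare_rr, rare_or = risk_ratio_and_odds_ratio(4, 996, 10, 990)

# Common event: 50% risk in the comparator group.
common_rr, common_or = risk_ratio_and_odds_ratio(200, 800, 500, 500)

print(f"Rare event:   RR = {rare_rr:.3f}, OR = {rare_or:.3f}")
print(f"Common event: RR = {common_rr:.3f}, OR = {common_or:.3f}")
```

In the rare-event table the two measures are nearly identical, while in the common-event table the OR (0.25) is clearly further from 1 than the RR (0.40).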
Mean difference
In the case of continuous variables, the measure of association is the mean difference. It is the difference between the mean in the treatment group (mean [T/t]) and the mean in the comparator group (mean [Comp]). It is similar to ARD with respect to interpretation.
MD = Mean (T/t) − Mean (Comp).
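As a small worked illustration, the mean difference can be computed as follows. The intraocular pressure values for both groups are hypothetical and serve only to show the calculation and the sign convention.

```python
import statistics

# Hypothetical post-treatment intraocular pressures in mmHg (illustrative only).
iop_treatment = [14, 15, 16, 17, 18]
iop_comparator = [18, 19, 20, 21, 22]

# MD = mean in treatment group - mean in comparator group
mean_difference = statistics.mean(iop_treatment) - statistics.mean(iop_comparator)

print(f"MD = {mean_difference} mmHg")
```

As with ARD, a negative MD here indicates a lower mean value in the treatment group than in the comparator group.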
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
